(54) Title: STARCH MODIFICATION

(57) Abstract: The present invention relates to a method of altering starch synthesis in a plant by modifying the starch priming 
activity of the plant. In particular, this is achieved by altering the expression or activity of a starch primer which is preferably encoded 
by the sequence of SEQ ID NO: 1 or a sequence substantially homologous thereto. Also provided are plants in which the starch 
priming activity has been altered, and propagating material derived from such plants.
STARCH MODIFICATION



This invention is based upon the identification of a protein, which initiates starch 
synthesis in a plant. In particular, the invention relates to plant glycogenin-like nucleic acid
molecules, plant glycogenin-like gene products, antibodies to plant glycogenin-like gene 
products, plant glycogenin-like regulatory regions, vectors and expression vectors with plant 
glycogenin-like genes, cells, plants and plant parts with plant glycogenin-like genes, modified 
starch from such plants and the use of the foregoing to improve agronomically valuable 
plants. 

Starch, a branched polymer of glucose consisting of largely linear amylose and highly 
branched amylopectin, is the product of carbon fixation during photosynthesis in plants, and 
is the primary metabolic energy reserve stored in seeds and fruit. For example, up to 75% of 
the dry weight of grain in cereals is made up of starch. The importance of starch as a food 
source is reflected by the fact that two thirds of the world's food consumption (in terms of 
calories) is provided by the starch in grain crops such as wheat, rice and maize. 

Starch is the product of photosynthesis, and is analogous to the storage compound 
glycogen in eukaryotes. It is produced in the chloroplasts or amyloplasts of plant cells, these 
being the plastids of photosynthetic cells and non-photosynthetic cells, respectively. The 
biochemical pathway leading to the production of starch in leaves has been well 
characterised, and considerable progress has also been made in elucidating the pathway of 
starch biosynthesis in storage tissues. 

The biosynthesis of starch molecules is dependent on a complex interaction of 
numerous enzymes, including several essential enzymes such as ADP-glucose pyrophosphorylase, a series of
starch synthases which use ADP-glucose as a substrate for forming chains of glucose linked
by alpha-1,4 linkages, and a series of starch branching enzymes that link sections of polymers
with alpha-1,6 linkages to generate branched structures (Smith et al., 1995, Plant Physiology,
107:673-677). Further modification of the starch by yet other enzymes, i.e. debranching
enzymes or disproportionating enzymes, can be specific to certain species. 

The fine structure of starch is a complex mixture of D-glucose polymers that consist
essentially of linear chains (amylose) and branched chains (amylopectin) glucans. Typically, 
amylose makes up between 10 and 25% of plant starch, but varies significantly among 
species. Amylose is composed of linear D-glucose chains typically 250-670 glucose units in 
length (Tester, 1997, in: Starch Structure and Functionality, Frazier et al., eds., Royal Society
of Chemistry, Cambridge, UK). The linear regions of amylopectin are composed of low 
molecular weight and high molecular weight chains, with the low ranging from 5 to 30 
glucose units and the high molecular weight chains from 30 to 100 or more. The 
amylose/amylopectin ratio and the distribution of low and high molecular weight D-glucose 
chains can affect starch granule properties such as gelatinization temperature, retrogradation, 
and viscosity (Blanshard, 1987). The characteristics of the fine structure of starch mentioned 
above have been examined at length and are well known in the art of starch chemistry. 

It is known that starch granule size and amylose percentage change during kernel
development in maize and during tobacco leaf development (Boyer et al., 1976, Cereal Chem
53:327-337). In their classic study, Boyer et al. concluded that the amylose percentage of starch
decreases with decreasing granule size in later stages of maize kernel development. 

As mentioned above, glycogen serves as the glucose reserve in animals rather than 
starch. The biosynthesis of glycogen in eukaryotes involves chain elongation through the 
formation of linear alpha-1,4 glycosidic linkages catalysed by the enzyme, glycogen 
synthase. Evidence for a distinct initiation step involving a self-glucosylating protein, known 
as glycogenin or SGP, came from work directed at mammalian systems (Smythe et al, Eur. J. 
Biochem 200:625-631 (1990) and Whelan Bioessays 5:136-140 (1986)). 

Cheng et al (Mol. and Cell Biol. 15(12): 6632-6640 (1995)) report the identification 
of two yeast genes whose products are implicated in the biosynthesis of glycogen. The two 
genes, Glg1 and Glg2, encode self-glucosylating proteins which in vitro act as primers for the
elongation reaction catalysed by glycogen synthase. Disruption of both these genes results in 
the inability to synthesise glycogen, despite normal levels of glycogen synthase. Glycogenin 
homologues have been identified in Caenorhabditis elegans and humans (Mu et al, J. Biol. 
Chem. 272(44): 27589-27597(1997)). 

It is now well established that glycogen synthesis is initiated on the primer protein, 
glycogenin or SGP, which remains covalently attached to the resulting macromolecule. The 
initiation step is thought to involve glycogenin growing a covalently attached oligosaccharide 
primer linked via a unique carbohydrate-protein bond via the hydroxyl group of the Tyr 
residue, Tyr 194. Once this oligosaccharide chain on glycogenin has been extended 
sufficiently glycogen synthase is able to catalyse elongation and, together with the branching
enzyme, form the mature glycogen molecule (Rodriguez and Whelan, Biochem Biophys Res
Comm, 132:829-836; Roach and Skurat, 1997, in Progress in Nucleic Acid Research and
Molecular Biology p289-316, Academic Press). 

Previous workers have set out to determine whether a priming molecule, such as a self 
glucosylating protein, is responsible for the initiation of starch synthesis in plants. 
WO94/04693 (Zeneca Ltd.) describes the purification of a putative starch priming protein
molecule from maize endosperm, known as amylogenin, and isolation of a partial cDNA. The
maize amylogenin showed no sequence homology with glycogenin and exhibited a novel
glucose-protein bond (Singh et al, FEBS Letters 376: 61-64 (1995)). However, based upon
the sequence homology and the reported properties of the maize protein, it has subsequently
been shown that the sequence of the maize nucleic acid molecule reported above is
homologous to a reversibly glycosylated polypeptide (RGP1) from pea (Dhugga et al, Proc.
Natl. Acad. Sci. USA 94:7679-7684 (1997)). RGP1 is localised to the Golgi apparatus and is
thought to be involved in cell wall synthesis. This has dispelled the initial idea that the
"amylogenin" molecule of WO94/04693 is involved in starch synthesis. In further work
(Langeveld, M.J. S et al. 2002 Plant Physiol, 129, pp 278-289) it is concluded that wheat and 
rice RGPs do not play a role in starch synthesis in a way similar to the functioning of 
glycogenin as a primer for glycogen synthesis. It is reported that RGP1 and RGP2 proteins in 
wheat and rice have different functions to glycogenin. 

Lightner et al. US 2002/0001843 described fragments of putative "corn (maize), 
wheat, and rice glycogenin and water stress proteins." Lightner et al. did not demonstrate the 
functionality of the fragments, but only their sequence homology to glycogenin from animals. 
To date, therefore, no one has identified and demonstrated a functional protein for starch 
initiation or starch priming in plants. 

Purified starch is used in numerous food and industrial applications and is the major 
source of carbohydrates in the human diet. Typically, starch is mixed with water and cooked 
to form a thickening agent or gel. Of central importance are the temperature at which the 
starch cooks, the viscosity that the agent or gel reaches, and the stability of the gel viscosity 
over time. The physical properties of unmodified starch limit its usefulness in many 
applications. As a result, considerable effort and expenditure is allocated to chemically
modify starch (i.e. cross-linking and substitution) in order to overcome the numerous
limitations of unmodified starch and to expand industrial usefulness. Modified starches can
be used in foods, paper, textiles, and adhesives.

It is an object of the invention to provide novel isolated nucleic acid molecules and 
isolated polypeptides, which novel molecules and polypeptides are able to provide modified
starch properties in transgenically modified plants. 

The invention relates to a family of plant glycogenin-like genes, also referred to as 
starch primer genes. In various embodiments, the invention provides plant glycogenin-like 
nucleic acid molecules including, but not limited to, plant glycogenin-like genes; plant 
glycogenin-like regulatory regions; plant glycogenin-like promoters; and vectors 
incorporating sequences encoding plant glycogenin-like nucleic acid molecules of the 
invention. Also provided are plant glycogenin-like gene products, including, but not limited 
to, transcriptional products such as mRNAs, antisense and ribozyme molecules, and 
translational products such as the plant glycogenin-like protein, polypeptides, peptides and 
fusion proteins related thereto; genetically engineered host cells that contain any of the 
foregoing nucleic acid molecules and/or coding sequences or complements, variants, or
fragments thereof operatively associated with a regulatory element that directs the expression 
of the gene and/or coding sequences in the host cell; genetically-engineered plants derived 
from host cells; modified starch and starch granules produced by genetically-engineered host 
cells and plants; and the use of the foregoing to improve agronomically valuable plants. 

In the context of the present invention, a "starch primer" used interchangeably 
with "plant glycogenin-like protein" includes any protein which is capable of initiating starch 
production in a plant. By definition, the plant glycogenin-like protein will be of plant origin. 
Preferred fragments of plant glycogenin-like proteins are those which retain the ability to
initiate starch synthesis.

The invention is based upon the identification of a protein responsible for initiation of 
starch synthesis in plants, which despite continued efforts over the last few years, no one had 
yet successfully identified. In particular, the inventors have discovered nucleic acid 
molecules from Arabidopsis which have sequences that are homologous to the known 
glycogenin genes of yeast and human. Analysis of one of these nucleic acid molecules indicates
that it contains a sequence encoding a transit peptide for plastid localization of the gene
product, consistent with a role in starch synthesis, referred to herein as plant glycogenin-like
starch initiation protein (PGSIP). Glycogenin-like genes from other plant species have been 
identified by analysis of sequence homology with the Arabidopsis sequences. The genes of 
the invention do not show homology to the amylogenin sequences or starch sequences of the 
prior art. 

Modulation of the initiation of starch synthesis allows various aspects of the 
biosynthetic process to be regulated. By altering aspects of the biosynthesis process such as 
temporal and spatial specificity, yield and storage, the carbohydrate profile of the plant may 
be altered in magnitude and directions that may be more favorable for nutritional or industrial 
uses. 

The present invention provides an isolated nucleic acid molecule that i) comprises a 
nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of 
SEQ ID NO: 3, or a fragment thereof; ii) comprises a nucleotide sequence at least 40% 
identical to SEQ ID NOs: 1 or 2, or a complement thereof as determined using the BESTFIT 
or GAP programs with a gap weight of 50 and a length weight of 3; or iii) hybridizes to a 
nucleic acid molecule consisting of SEQ ID NOs: 1 or 2 under low stringency conditions of 
hybridization of washing at 60°C for 2x 15 minutes at 2 x SSC, 0.5x SDS, or a complement 
thereof. The present invention also provides an isolated nucleic acid molecule of the 
invention comprising SEQ ID NOs: 1 or 2 or a complement thereof. In an embodiment of the 
invention, an isolated nucleic acid molecule comprises a nucleotide sequence selected from 
the group consisting of nucleotide residues 516-592, 681-918, 1039-1655, 1762-2536 and 
2991-3264 of SEQ ID NO: 1. 
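The exon coordinates recited above can be handled programmatically. The following is a minimal Python sketch, assuming the stated positions are 1-based and inclusive; the function names and the placeholder sequence are hypothetical illustrations and do not reproduce SEQ ID NO: 1.

```python
# Exon coordinates of SEQ ID NO: 1 as recited in the text
# (assumed here to be 1-based and inclusive).
EXONS = [(516, 592), (681, 918), (1039, 1655), (1762, 2536), (2991, 3264)]


def exon_lengths(exons):
    """Return the length of each exon for 1-based inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def introns_between(exons):
    """Return the intervals lying between consecutive exons (1-based, inclusive)."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


def splice(genomic, exons):
    """Concatenate the exon segments of a genomic sequence string."""
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return "".join(genomic[start - 1:end] for start, end in exons)


if __name__ == "__main__":
    print("exon lengths:", exon_lengths(EXONS))
    print("introns:", introns_between(EXONS))
    # Hypothetical stand-in of the right length, NOT the real sequence.
    placeholder = "N" * 3264
    print("joined exon length:", len(splice(placeholder, EXONS)))
```

Supplying the actual genomic sequence in place of the placeholder would give the joined exon sequence for comparison with a cDNA.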

Another embodiment of the invention encompasses an isolated nucleic acid molecule 
of the invention that i) comprises a nucleotide sequence which encodes a polypeptide 
comprising the amino acid sequence of SEQ ID NO: 11, or a fragment thereof; ii) comprises
a nucleotide sequence at least 70% identical to SEQ ID NO: 10, or a complement thereof as 
determined using the BESTFIT or GAP programs with a gap weight of 50 and a length 
weight of 3, wherein the nucleotide sequence does not encode an amino acid of SEQ ID NO: 
35; or iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NO: 10 under stringent
conditions of hybridization, or a complement thereof, wherein the sequence does not encode 
an amino acid of SEQ ID NO: 35. In a related embodiment, the isolated nucleic acid 
molecule of the invention comprises SEQ ID NO: 10 or a complement thereof. In another
related embodiment an isolated nucleic acid molecule of the invention comprises the amino 
acid sequence that is at least 98% identical to SEQ ID NO: 9 as determined using the 
BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4. The invention
also encompasses an isolated nucleic acid molecule that comprises the nucleotide sequence of
SEQ ID NO: 8 or a complement thereof. 

In an embodiment of the invention, an isolated nucleic acid molecule of the invention 
i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid 
sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34, or a fragment 
thereof; ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NOs: 4, 5, 6, 
12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof as determined using the
BESTFIT or GAP programs with a gap weight of 50 and a length weight of 3; or iii)
hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20,
23, 25, 27, 29, 31, 33 under stringent conditions of hybridization, or a complement thereof. 
In a related embodiment, the isolated nucleic acid molecule of the invention comprises SEQ 
ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof. In 
another embodiment of the invention, a fragment of the isolated nucleic acid molecule of the 
invention comprises at least 40, 60, 80, 100 or 150 contiguous nucleotides of the nucleic acid 
molecule. In yet another embodiment, the isolated nucleic acid molecule of the invention 
comprises the nucleotide sequence of nucleotides 1-195 of SEQ ID NO: 2, or a complement 
thereof. 
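The nucleotide range 1-195 of SEQ ID NO: 2 and the amino acid range 1-65 of SEQ ID NO: 3 are related by the three-nucleotide codon length. A minimal Python sketch of that arithmetic follows, assuming (as an illustration only) that the coding sequence begins at nucleotide 1 of the cDNA; the function names and the `cds_start` parameter are hypothetical.

```python
def nt_range_to_aa_range(nt_start, nt_end, cds_start=1):
    """Map a 1-based inclusive nucleotide range onto residue numbers.

    cds_start is the position of the first nucleotide of the first codon;
    the default of 1 is an assumption made for this sketch.
    """
    first = (nt_start - cds_start) // 3 + 1
    last = (nt_end - cds_start) // 3 + 1
    return first, last


def aa_range_to_nt_range(aa_start, aa_end, cds_start=1):
    """Map a 1-based inclusive residue range onto nucleotide positions."""
    nt_start = cds_start + 3 * (aa_start - 1)
    nt_end = cds_start + 3 * aa_end - 1
    return nt_start, nt_end


if __name__ == "__main__":
    # Transit peptide coordinates as recited in the text.
    print(nt_range_to_aa_range(1, 195))
    print(aa_range_to_nt_range(1, 65))
```

Under the stated assumption the two ranges map onto each other exactly, since 195 = 3 x 65.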

According to one aspect of the invention, an isolated polypeptide of the invention 
comprises the amino acid sequence of amino acid residues 1-65 of SEQ ID NO: 3, or a
fragment thereof. In a related aspect, an isolated polypeptide comprises i) an amino acid
sequence that is at least 70% identical to SEQ ID NO: 3 or a fragment thereof as determined
using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; ii) an 
amino acid sequence encoded by the nucleic acid molecule of the invention; or iii) an amino 
acid sequence of SEQ ID NO: 3.
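The percent-identity thresholds recited here are determined with the BESTFIT or GAP programs. The Python sketch below does not reproduce either program or their gap and length weights; it only illustrates, under simplifying assumptions, how a percent identity could be computed once two sequences have already been aligned. Counting every alignment column (including gapped ones) in the denominator is a choice made for this sketch, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap are ignored; all other
    columns, including single-gap columns, count toward the denominator.
    This convention is an assumption of the sketch, not the GCG definition.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the sequence listing.
    print(percent_identity("AC-GT", "ACTGT"))
    print(meets_threshold("ACGT", "ACGA", 70))
```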

An embodiment of the invention encompasses an isolated polypeptide of the 
invention that comprises i) an amino acid sequence at least 70% identical to SEQ ID NO: 11
as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length
weight of 4, or a fragment thereof; ii) an amino acid sequence encoded by the nucleic acid
molecule of the invention; or iii) an amino acid sequence of SEQ ID NO: 11.

In another embodiment of the invention, an isolated polypeptide of the invention 
comprises i) an amino acid sequence that is at least 98% identical to SEQ ID NO: 9 as 
determined using the BESTFIT or GAP programs with a gap weight of 12 and a length 
weight of 4; ii) an amino acid sequence encoded by the nucleic acid molecule of SEQ ID
NO: 8, or a complement thereof; or iii) an amino acid sequence of SEQ ID NO: 9, or a
fragment thereof. 

The invention further provides for an isolated polypeptide that comprises i) an amino 
acid sequence that is at least 70% identical to SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 
28, 30, 32, 34, or a fragment thereof as determined using the BESTFIT or GAP programs 
with a gap weight of 12 and a length weight of 4; ii) an amino acid sequence encoded by the 
nucleic acid molecule of the invention; or iii) an amino acid sequence of SEQ ID NOs: 7, 13, 
15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34. In an embodiment of the invention, a fragment of a 
polypeptide of the invention comprises at least 5 amino acid residues, wherein said fragment
is a portion of the polypeptide encoded by a nucleic acid molecule selected from the group 
consisting of exon I, exon II, exon III, exon IV and exon V of SEQ ID NO: 1. 

Another embodiment of the invention encompasses the polypeptide of SEQ ID NOs: 3, 7,
9, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34 further comprising one or more
conservative amino acid substitutions. In yet another embodiment, the invention provides for a
fusion protein comprising the amino acid sequence of the invention and a heterologous 
protein. 

The invention provides for an isolated polypeptide fragment or immunogenic 
fragment that comprises at least 5, 8, 10, 15, 20, 25, 30 or 35 consecutive amino acids of a 
polypeptide according to the invention. The invention further provides for an antibody that 
immunospecifically binds to a polypeptide of the invention. 
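Whether a candidate sequence shares a run of at least a given number of consecutive residues with a reference polypeptide (or contiguous nucleotides with a reference nucleic acid) can be checked with a longest-common-substring search. The Python sketch below is one simple way to do this; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def longest_shared_run(candidate, reference):
    """Length of the longest contiguous stretch present in both strings.

    Uses the standard dynamic-programming recurrence for the longest
    common substring, keeping only the previous row to save memory.
    """
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def has_fragment(candidate, reference, min_len):
    """True if candidate and reference share at least min_len consecutive residues."""
    return longest_shared_run(candidate, reference) >= min_len


if __name__ == "__main__":
    # Hypothetical toy sequences for illustration only.
    reference = "AAMKTLLAGG"
    candidate = "XXMKTLLAYY"
    print(longest_shared_run(candidate, reference))
    print(has_fragment(candidate, reference, 5))
```

The same check applies to nucleotide strings with the minimum lengths recited for nucleic acid fragments.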

In one embodiment the invention encompasses a method for making a polypeptide of
the invention, comprising the steps of a) culturing a cell comprising a recombinant
polynucleotide encoding a polypeptide of the invention under conditions that allow said 
polypeptide to be expressed by said cell; and b) recovering the expressed polypeptide. 

According to another aspect of the invention, the present invention provides a 
complex comprising a polypeptide encoded by a nucleic acid molecule of the invention and a 
starch molecule. In one embodiment of the complex of the invention, the starch molecule 
comprises from 1 to 700 glucose units. In another embodiment of the complex of the 
invention the starch molecule comprises branching chains of glucose polysaccharides. 

According to yet another aspect of the invention, the present invention provides a 
vector comprising a nucleic acid molecule of the invention. Alternatively, the present
invention provides an expression vector comprising a nucleic acid molecule of the invention
and at least one regulatory region operably linked to the nucleic acid molecule. 

Advantageously the expression vector of the invention comprises a regulatory region 
that confers chemically-inducible, dark-inducible, developmentally regulated, developmental-stage
specific, wound-induced, environmental factor-regulated, organ-specific, cell-specific,
and/or tissue-specific expression of the nucleic acid molecule or constitutive expression of
the nucleic acid molecule of the invention. Advantageously the expression vector of the
invention comprises a regulatory region selected from the group consisting of a 35S CaMV
promoter, a rice actin promoter, a patatin promoter and a high molecular weight glutenin gene 
of wheat. In another embodiment, an expression vector of the invention comprises the 
antisense sequence of a nucleic acid molecule of the invention, wherein the antisense 
sequence is operably linked to at least one regulatory region. 
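An antisense sequence is the reverse complement of the corresponding sense strand. As a minimal Python sketch of that relationship (the function name and the toy input are hypothetical, and only unambiguous DNA bases plus N are handled):

```python
# Translation table for DNA bases; N is mapped to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
_ALLOWED = set("ACGTNacgtn")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    unexpected = set(dna) - _ALLOWED
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse to read 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGGCC"  # hypothetical toy fragment
    antisense = reverse_complement(sense)
    print(antisense)
    # Applying the operation twice recovers the sense strand.
    print(reverse_complement(antisense) == sense)
```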

The invention also provides for a genetically-engineered cell which comprises a 
nucleic acid molecule of the invention. In one embodiment, a cell comprises the expression 
vector of the invention comprising a nucleic acid molecule of the invention and at least one 
regulatory region operably linked to the nucleic acid molecule. In another embodiment, a cell 
comprises the expression vector of the invention comprising the antisense sequence of
nucleic acid molecules of the invention, wherein the antisense sequence is operably linked to
at least one regulatory region.

Yet another aspect of the invention provides a genetically-engineered plant
comprising the isolated nucleic acid molecule of the invention. The invention also provides a 
genetically-engineered plant comprising an isolated nucleic acid molecule of the invention 
and progeny thereof, and further comprising a transgene encoding an antisense nucleotide 
sequence. The invention also provides for a genetically-engineered plant comprising an 
isolated nucleic acid molecule of the invention, and further comprising an RNA interference 
construct. 

An embodiment of the invention encompasses a cell comprising a 35S CaMV
constitutive promoter operably linked to a nucleic acid molecule of the invention, fragments
thereof, or the nucleic acid molecule of SEQ ID NO:2 or a rice actin promoter operably
linked to an RNA interference construct comprising a nucleic acid molecule of the invention,
fragments thereof, or fragments of a nucleic acid molecule of SEQ ID NO:2.

Another aspect of the invention provides a method of altering starch synthesis in a 
plant comprising, introducing into a plant an expression vector of the invention, such that 
starch synthesis is altered relative to a plant without the expression vector. Yet another 
embodiment of the invention provides a method of altering starch synthesis in a plant 
comprising, introducing into a plant at least an expression vector comprising the antisense 
sequence of a nucleic acid molecule of the invention, wherein the antisense sequence is
operably linked to at least one regulatory region, such that starch synthesis is altered in 
comparison to a plant without the expression vector. 

In another aspect of the invention, the present invention provides a method of altering 
starch granules in a plant comprising introducing into a plant at least an expression vector
comprising a nucleic acid molecule of the invention and at least one regulatory region 
operably linked to the nucleic acid molecule, such that the starch granules are altered in 
comparison to a plant without the expression vector. 

Advantageously the present invention provides a method of altering starch granules in
a plant comprising introducing into a plant at least an expression vector of the invention,
such that the starch granules are altered in comparison to a plant without the expression
vector.

The invention further provides a method of altering starch granules in a plant
comprising introducing into a plant at least an expression vector comprising a nucleic acid
molecule of the invention and at least one regulatory region operably linked to the nucleic
acid molecule, such that the starch granules are absent from leaves of the plant comprising at
least an expression vector.

In a preferred embodiment of the invention, a plant part comprises a nucleic acid 
molecule of the invention resulting in an alteration in starch synthesis. In another preferred 
embodiment the plant part is a tuber, seed, or leaf. 

The invention also provides for the modified starch obtained from the plant parts of 
the invention, wherein the modification is selected from the group consisting of a ratio of 
amylose to amylopectin, amylose content, size of starch granules, quantity of size of starch 
granules, a ratio of small to large starch granules, and rheological properties of the starch as 
measured using viscometric analysis. 

The present invention will now be illustrated by way of non-limiting examples, with 
reference to the sequence identifiers and Figures in which: 

SEQ ID NO: 1 shows the genomic sequence of a starch primer gene isolated from Arabidopsis 
thaliana referred to herein as plant glycogenin-like starch initiation protein (PGSIP), 
at3g18660, GenBank Accession No. NM_112752. The gene includes part of the promoter
region, where the putative TATA and CAAT box are located at nucleotides 424-428 and 373- 
376 respectively. The exons are located at nucleotides 516-592, 681-918, 1039-1655, 1762- 
2536 and 2991-3264. 

SEQ ID NO: 2 shows the deduced cDNA sequence of Arabidopsis thaliana PGSIP with
protein translation. The transit peptide is located at nucleotides 1-195.

SEQ ID NO: 3 shows the amino acid sequence representing the Arabidopsis thaliana PGSIP
protein. The predicted transit peptide is located at amino acid residues 1-65.

SEQ ID NO: 4 shows the nucleotide sequence of the maize EST of GenBank Accession No.
BF729544 with homology to the Arabidopsis thaliana PGSIP gene. The nucleotide sequence
with homology to the Arabidopsis thaliana PGSIP gene is located at nucleotides 1-557.

SEQ ID NO: 5 shows the nucleotide sequence of the maize EST BG837930 with homology to
the Arabidopsis thaliana PGSIP gene. The nucleotide sequence with homology to the
Arabidopsis thaliana PGSIP gene is located at nucleotides 1-726.

SEQ ID NO: 6 shows the deduced cDNA of the Arabidopsis glycogenin-like gene
(at1g77130) with protein translation. The protein sequence has homology to a small region
(amino acid residues 1023-1146) of the dull1 gene from maize (O64923).

SEQ ID NO: 7 shows the amino acid sequence of at1g77130.

SEQ ID NO: 8 shows the deduced cDNA of the Arabidopsis glycogenin-like gene
(at1g08990), GenBank Accession No. NM_100770, with protein translation.

SEQ ID NO: 9 shows the amino acid sequence of at1g08990.

SEQ ID NO: 10 shows the deduced cDNA of the Arabidopsis glycogenin-like gene
(at1g54940), GenBank Accession No. NM_104367, with protein translation.

SEQ ID NO: 11 shows the amino acid sequence of at1g54940.

SEQ ID NO: 12 shows the deduced cDNA of the Arabidopsis glycogenin-like gene
(at4g33330), GenBank Accession No. NM_119487, with protein translation.

SEQ ID NO: 13 shows the amino acid sequence of at4g33330.

SEQ ID NO: 14 shows the deduced cDNA of the Arabidopsis glycogenin-like gene
(at4g33340), GenBank Accession No. NM_119488, with protein translation.

SEQ ID NO: 15 shows the amino acid sequence of at4g33340.

SEQ ID NO: 16 shows the nucleotide sequence of Barley EST Seq1.

SEQ ID NO: 17 shows the amino acid sequence of Barley EST Seq1.

SEQ ID NO: 18 shows the nucleotide sequence of Barley EST Seq2.

SEQ ID NO: 19 shows the amino acid sequence of Barley EST Seq2.

SEQ ID NO: 20 shows the nucleotide sequence of a wheat EST.

SEQ ID NO: 21 shows the first half of the amino acid sequence of the wheat EST.

SEQ ID NO: 22 shows the second half of the amino acid sequence of the wheat EST.

SEQ ID NO: 23 shows the deduced cDNA of the Arabidopsis gene EMBL:AY062695,
GenBank Accession No. AY062695, with homology to the Arabidopsis PGSIP gene, with
protein translation.

SEQ ID NO: 24 shows the amino acid sequence of EMBL:AY062695.

SEQ ID NO: 25 shows the deduced cDNA of the Rice gene SPTrEMBL:Q94HG3, GenBank
Accession No. AC079633, with homology to the Arabidopsis PGSIP gene, with protein
translation.

SEQ ID NO: 26 shows the amino acid sequence of SPTrEMBL:Q94HG3.

SEQ ID NO: 27 shows the nucleotide sequence of Maize EST Seq1.

SEQ ID NO: 28 shows the amino acid sequence of Maize EST Seq1.

SEQ ID NO: 29 shows the nucleotide sequence of Maize EST Seq2.

SEQ ID NO: 30 shows the amino acid sequence of Maize EST Seq2.

SEQ ID NO: 31 shows the nucleotide sequence of Maize EST Seq3.

SEQ ID NO: 32 shows the amino acid sequence of Maize EST Seq3.

SEQ ID NO: 33 shows the nucleotide sequence of Maize EST Seq4.

SEQ ID NO: 34 shows the amino acid sequence of Maize EST Seq4.

SEQ ID NO: 35 shows an amino acid sequence as a result of a conceptual translation of a
portion of a genomic clone from Arabidopsis thaliana as it appears in US Patent Application
No. 2002/0001843.

Figure 1 shows the plasmid containing the Arabidopsis thaliana plant glycogenin-like starch
initiation protein (PGSIP) gene.

Figure 2 shows the plasmid map for pTPYES.

Figure 3 shows the plasmid map for pNTPYES.

Figure 4A shows a genomic region containing AT3g18660 (PGSIP); 4B shows a non-radioactive
Southern blot of Arabidopsis, wheat and maize genomic DNA probed with C-terminus
AT3g18660 cDNA under high stringency conditions. N-NcoI, A-AvaI, C-ClaI. The
probe used for the blot of Figure 4B is also shown.

Figure 5A shows a non-radioactive Southern blot of Arabidopsis, wheat and maize genomic
DNA probed with N-terminal AT3g18660 (PGSIP) cDNA fragment under low stringency
conditions. N-NcoI, A-AvaI, C-ClaI. Lane M is a marker, lane 1 is AT (EcoRI), lane 2 is AT
(XhoI), lane 3 is AT (EcoRV), lane 4 is wheat (EcoRI), lane 5 is wheat (XhoI), lane 6 is
wheat (EcoRV), lane 7 is maize (EcoRI), lane 8 is maize (XhoI), and lane 9 is maize
(EcoRV); 5B shows a non-radioactive Southern blot of Arabidopsis, wheat and maize
genomic DNA probed with C-terminal AT3g18660 (PGSIP) cDNA fragment under low
stringency conditions. N-NcoI, A-AvaI, C-ClaI. Lane M is a marker, lane 1 is AT (EcoRI),
lane 2 is AT (XhoI), lane 3 is AT (EcoRV), lane 4 is wheat (EcoRI), lane 5 is wheat (XhoI),
lane 6 is wheat (EcoRV), lane 7 is maize (EcoRI), lane 8 is maize (XhoI), and lane 9 is maize
(EcoRV); 5C shows the N-terminal and C-terminal region of the PGSIP cDNA used to probe
the blots of 5A and 5B.

Figure 6 shows the cloning strategy and plasmid maps for the production of the PGSIP RNAi
construct pCL76 SCV.

Figure 7 shows the plasmid map for pCL68 SCV (sense expression construct) containing the
AT3g18660 (PGSIP) cDNA.

Figure 8 shows the plasmid map for pCL76 SCV (RNAi construct) containing fragments of
the AT3g18660 (PGSIP) cDNA.

Figure 9 shows the plasmid map for pMC177 (sense expression construct) containing the
AT3g18660 (PGSIP) under rice actin promoter used in barley and Arabidopsis
transformation.

Figure 10 shows the plasmid map for pMC176 (RNAi construct) containing the AT3g18660
(PGSIP) under rice actin promoter used in barley and Arabidopsis transformation.

Figure 11A shows the results of iodine staining of leaves of barley which was shown to be
PCR positive for the (pCL76 SCV) RNAi PGSIP constructs. Starch grains are absent; 11B
shows the results of iodine staining of leaves of barley which was shown to be PCR negative
for the (pCL76 SCV) RNAi PGSIP constructs. Starch grains are visible.

For purposes of clarity, and not by way of limitation, the invention is 
described in the subsections below in terms of (a) plant glycogenin-like nucleic acid 
molecules; (b) plant glycogenin-like gene products; (c) transgenic plants that ectopically 
express plant glycogenin-like protein; (d) transgenic plants in which endogenous plant 
glycogenin-like protein expression is suppressed; and (e) starch characterized by altered 
structure and physical properties produced by the methods of the invention. 

1.0 PLANT GLYCOGENIN-LIKE NUCLEIC ACIDS 
The nucleic acid molecules of the invention may be DNA or RNA and comprise the 
nucleotide sequences of a plant glycogenin-like gene, or fragments or variants thereof. A 
polynucleotide is intended to include DNA molecules (e.g., cDNA, genomic DNA), RNA 
molecules (e.g., hnRNA, pre-mRNA, mRNA, double-stranded RNA), and DNA or RNA 
analogs generated using nucleotide analogs. The polynucleotide can be single-stranded or 
double-stranded. 

The nucleic acid molecules are characterized by their homology to known glycogen 
primer (glycogenin) genes, such as those from yeast (Glg1 and Glg2), human (any isoform), 
C. elegans, rat or rabbit, or plant glycogenin-like genes such as those defined herein. A 
preferred nucleic acid molecule of this embodiment is one that encodes the amino acid 
sequence of SEQ ID NO: 2, or a fragment or variant thereof, or a nucleic acid molecule 
comprising a sequence substantially similar to SEQ ID NO: 2. In a most preferred 
embodiment, the nucleic acid molecule comprises the nucleotide sequence shown in SEQ ID 
NO: 1, or a fragment or variant thereof, or a sequence substantially similar to SEQ ID NO: 1. 
The variants may be allelic variants, that is, multiple forms of a particular gene or of the 
protein encoded by a particular gene. Fragments of a plant glycogenin-like gene may 
include regulatory elements of the gene such as promoters, enhancers, transcription factor 
binding sites, and/or segments of a coding sequence, for example, a conserved domain, exon, 
or transit peptide. 

In a preferred embodiment, the nucleic acid molecules of the invention are comprised 
of full length sequences in that they encode an entire plant glycogenin-like protein as it 
occurs in nature. Examples of such sequences include SEQ ID NOs: 1, 2, 6, 8, 10, 12, and 
14. The corresponding amino acid sequences of full length glycogenin-like proteins are SEQ 
ID NOs: 3, 7, 9, 11, 13, and 15. 

In an alternative embodiment, the nucleic acid molecules of the invention comprise a 
nucleotide sequence of SEQ ID NOs: 1, 2, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 
or 33. 

The nucleic acid molecules and their variants can be identified by several approaches 
including but not limited to analysis of sequence similarity and hybridization assays. 

In the context of the present invention the term "substantially homologous," 
"substantially identical," or "substantial similarity," when used herein with respect to 
sequences of nucleic acid molecules, means either that the sequence has at least 45% 
sequence identity with the reference sequence, preferably 50% sequence identity, more 
preferably at least 60%, 70%, 80% or 90%, and most preferably at least 95% sequence identity 
with said sequence (in some cases the sequence identity may be 98%, or more preferably 
99%, or above), or that the nucleic acid molecule is capable of 
hybridizing to the complement of the nucleic acid molecule having the reference sequence 
under stringent conditions. 

"% identity", as known in the art, is a measure of the relationship between two 
polynucleotides or two polypeptides, as determined by comparing their sequences. In 
general, the two sequences to be compared are aligned to give a maximum correlation 
between the sequences. The alignment of the two sequences is examined and the number of 
positions giving an exact amino acid or nucleotide correspondence between the two 
sequences is determined, divided by the total length of the alignment and multiplied by 100 to 
give a % identity figure. This % identity figure may be determined over the whole length of 
the sequences to be compared, which is particularly suitable for sequences of the same or 
very similar length and which are highly homologous, or over shorter defined lengths, which 
is more suitable for sequences of unequal length or which have a lower level of homology. 
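
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the use of "-" as the gap character and the example sequences are assumptions made for the sketch, and gapped columns are counted in the alignment length, following the definition given above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return % identity of two pre-aligned sequences.

    Identity = exact matches / total alignment length * 100.
    Both inputs must already be aligned (equal length, gaps as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical six-column alignment: four matching columns out of six.
example = percent_identity("ACGT-A", "ACCTGA")
```

Restricting the two input strings to a shorter window of the alignment gives the "shorter defined length" variant mentioned above.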

For example, sequences can be aligned with the software clustalw under Unix, which 
generates a file with a ".aln" extension; this file can then be imported into the BioEdit 
program (Hall, T.A. 1999. BioEdit: a user-friendly biological sequence alignment editor and 
analysis program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 41:95-98), which opens 
the .aln file. In the BioEdit window, one can choose individual sequences (two at a time) and 
align them. This method allows for comparison of the entire sequences. 

Methods for comparing the identity of two or more sequences are well known in the 
art. Thus, for instance, programs available in the Wisconsin Sequence Analysis Package, 
version 9.1 (Devereux J et al, Nucleic Acids Res. 12:387-395, 1984, available from Genetics 
Computer Group, Madison, Wisconsin, USA) can be used. The determination of percent identity 
between two sequences can be accomplished using a mathematical algorithm. For example, 
the programs BESTFIT and GAP may be used to determine the % identity between two 
polynucleotides and the % identity between two polypeptide sequences. BESTFIT uses the 
"local homology" algorithm of Smith and Waterman (Advances in Applied Mathematics, 
2:482-489, 1981) and finds the best single region of similarity between two sequences. 
BESTFIT is more suited to comparing two polynucleotide or two polypeptide sequences 
which are dissimilar in length, the program assuming that the shorter sequence represents a 
portion of the longer. In comparison, GAP aligns two sequences finding a "maximum 
similarity" according to the algorithm of Needleman and Wunsch (J. Mol. Biol. 48:443-453, 
1970). GAP is more suited to comparing sequences which are approximately the same length 
and for which an alignment is expected over the entire length. Preferably the parameters "Gap Weight" 
and "Length Weight" used in each program are 50 and 3 for polynucleotides and 12 and 4 for 
polypeptides, respectively. Preferably % identities and similarities are determined when the 
two sequences being compared are optimally aligned. 
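
The difference between the local (Smith and Waterman) and global (Needleman and Wunsch) approaches can be illustrated with a short Python sketch. It is not the BESTFIT or GAP implementation: the scoring values (match +1, mismatch -1, linear gap penalty -2) and the function names are assumptions chosen for brevity, whereas the GCG programs use the "Gap Weight" and "Length Weight" parameters and substitution matrices. Only alignment scores are computed, not the alignments themselves.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score: both sequences aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]  # row 0: leading gaps in a
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: leading gaps in b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman score: best-scoring pair of subsequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, so an alignment can restart anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Hypothetical sequences sharing only the short region "ACG".
long_seq, short_seq = "TTTACGTTTT", "GGACGGG"
local = local_score(long_seq, short_seq)    # rewards the shared region only
overall = global_score(long_seq, short_seq)  # penalised for the flanking mismatch
```

With these inputs the local score reflects only the shared region, while the global score is pulled down by the unmatched flanks, which is why a local method suits sequences of dissimilar length.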

Other programs for determining identity and/or similarity between sequences are also 
known in the art, for instance the BLAST family of programs (Karlin & Altschul, 1990, Proc. 
Natl. Acad. Sci. USA 87:2264-2268, modified as in Karlin & Altschul, 1993, Proc. Natl. 
Acad. Sci. USA 90:5873-5877, available from the National Center for Biotechnology 
Information (NCBI), Bethesda, Maryland, USA and accessible through the home page of the 
NCBI at www.ncbi.nlm.nih.gov). These programs exemplify a preferred, non-limiting 
example of a mathematical algorithm utilized for the comparison of two sequences. Such an 
algorithm is incorporated into the BLASTN and BLASTX programs of Altschul, et al., 1990, 
J. Mol. Biol. 215:403-410. BLAST nucleotide searches can be performed with the BLASTN 
program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to the 
nucleic acid molecules of the invention. BLAST protein searches can be performed with the 
XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences homologous 
to the protein molecules of the invention. To obtain gapped alignments for comparison 
purposes, Gapped BLAST can be utilized as described in Altschul et al., 1997, Nucleic Acids 
Res. 25:3389-3402. Alternatively, PSI-Blast can be used to perform an iterated search which 
detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped 
BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., 
BLASTX and BLASTN) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred, 
non-limiting example of a mathematical algorithm utilized for the comparison of sequences is 
the algorithm of Myers and Miller, 1988, CABIOS 4:11-17. Such an algorithm is 
incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence 
alignment software package. When utilizing the ALIGN program for comparing amino acid 
sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 
can be used. 

Another non-limiting example of a program for determining identity and/or similarity 
between sequences known in the art is FASTA (Pearson W.R. and Lipman D.J., Proc. Nat. 
Acad. Sci., USA, 85:2444-2448, 1988, available as part of the Wisconsin Sequence Analysis 
Package). Preferably the BLOSUM62 amino acid substitution matrix (Henikoff S. and 
Henikoff J.G., Proc. Nat. Acad. Sci., USA, 89:10915-10919, 1992) is used in polypeptide 
sequence comparisons, including where nucleotide sequences are first translated into amino 
acid sequences before comparison. 

Yet another non-limiting example of a program known in the art for determining 
identity and/or similarity between amino acid sequences is SeqWeb Software (a web-based 
interface to the GCG Wisconsin Package: Gap program) which is utilized with the default 
algorithm and parameter settings of the program: blosum62, gap weight 8, length weight 2. 

The percent identity between two sequences can be determined using techniques 
similar to those described above, with or without allowing gaps. In calculating percent 
identity, typically exact matches are counted. 

Preferably the program BESTFIT is used to determine the % identity of a query 
polynucleotide or a polypeptide sequence with respect to a polynucleotide or a polypeptide 
sequence of the present invention, the query and the reference sequence being optimally 
aligned and the parameters of the program set at the default value. 

Alternatively, variants and fragments of the nucleic acid molecules of the invention 
can be identified by hybridization to SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 
27, 29, 31, or 33. In the context of the present invention "stringent conditions" are defined as 
those given in Martin et al (EMBO J 4:1625-1630 (1985)) and Davies et al (Methods in 
Molecular Biology Vol 28: Protocols for nucleic acid analysis by non-radioactive probes, 
Isaac, P.G. (ed), Humana Press Inc., Totowa, NJ, USA). Hybridization was carried out 
overnight at 65°C (high stringency conditions) or 55°C (low stringency conditions). The 
filters were washed for 2 x 15 minutes with 0.1 x SSC, 0.5 x SDS at 65°C (high stringency 
washing). For low stringency washing, the filters were washed at 60°C for 2 x 15 minutes at 
2 x SSC, 0.5 x SDS. 

In instances wherein the nucleic acid molecules are oligonucleotides ("oligos"), highly 
stringent conditions may refer, e.g., to washing in 6xSSC / 0.05% sodium pyrophosphate at 
37°C (for 14-base oligos), 48°C (for 17-base oligos), 55°C (for 20-base oligos), and 60°C (for 
23-base oligos). These nucleic acid molecules may act as plant glycogenin-like gene 
antisense molecules, useful, for example, in plant glycogenin-like gene regulation and/or as 
antisense primers in amplification reactions of plant glycogenin-like gene and/or nucleic acid 
molecules. Further, such nucleic acid molecules may be used as part of ribozyme and/or 
triple helix sequences, also useful for plant glycogenin-like gene regulation. Still further, 
such molecules may be used as components in probing methods whereby the presence of a 
plant glycogenin-like allele may be detected. 
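
As an illustration only, the oligonucleotide wash temperatures listed above can be held in a small lookup table. The names used here and the decision to reject oligo lengths that are not listed (rather than interpolate between them) are assumptions of this sketch and are not part of the conditions described.

```python
# Wash temperature in degrees C for 6xSSC / 0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases (values as listed in the text).
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the listed wash temperature for an oligo of the given length."""
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        # No interpolation: only the four listed lengths are supported.
        raise ValueError(
            "no wash temperature listed for %d-base oligos" % oligo_length
        ) from None
```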

In one embodiment, a nucleic acid molecule of the invention may be used to identify 
other plant glycogenin-like genes by identifying homologs. This procedure may be 
performed using standard techniques known in the art, for example screening of a cDNA 
library by probing; amplification of candidate nucleic acid molecules; complementation 
analysis, and yeast two-hybrid system (Fields and Song Nature 340 245-246 (1989); Green 
and Hannah Plant Cell 10 1295-1306 (1998)). 

The invention also includes nucleic acid molecules, preferably DNA molecules, that 
are amplified using the polymerase chain reaction and that encode a gene product 
functionally equivalent to a plant glycogenin-like gene product. 

In another embodiment of the invention, nucleic acid molecules which hybridize 
under stringent conditions to the nucleic acid molecules comprising a plant glycogenin-like 
gene and its complement are used in altering starch synthesis in a plant. Such nucleic acid 
molecules may hybridize to any part of a plant glycogenin-like gene, including the regulatory 
elements. Preferred nucleic acid molecules are those which hybridize under stringent 
conditions to a nucleic acid molecule comprising the nucleotide sequence encoding the amino 
acid sequence of SEQ ID NO: 2, and/or a nucleotide sequence of any one of SEQ ID NOs: 1, 2, 
4-6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, or 33 or their complement sequences. 
Preferably the nucleic acid molecules which hybridize under stringent conditions to a nucleic 
acid molecule comprising the sequence of a plant glycogenin-like gene or its complement are 
complementary to the nucleic acid molecule to which they hybridize. 

In another embodiment of the invention, nucleic acid molecules which hybridize 
under stringent conditions to the nucleic acid molecules of SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 
14, 16, 18, 20, 23, 25, 27, 29, 31, or 33 hybridize over the full length of the sequences of the 
nucleic acid molecules. 

Alternatively, nucleic acid molecules of the invention or their expression products 
may be used in screening for agents which alter the activity of a plant glycogenin-like protein 
of a plant. Such a screen will typically comprise contacting a putative agent with a nucleic 
acid molecule of the invention or expression product thereof and monitoring the reaction 
therebetween. The reaction may be monitored by expression of a reporter gene operably 
linked to a nucleic acid molecule of the invention, or by binding assays which will be known 
to persons skilled in the art. 

Fragments of a plant glycogenin-like nucleic acid molecule of the invention 
preferably comprise or consist of at least 40 continuous or consecutive nucleotides of the 
plant glycogenin-like nucleic acid molecule of the invention, more preferably at least 60 
nucleotides, at least 80 nucleotides, or most preferably at least 100 or 150 nucleotides in 
length. Fragments of a plant glycogenin-like nucleic acid molecule of the invention 
encompassed by the invention may include elements involved in regulating expression of the 
gene or may encode functional plant glycogenin-like proteins. Fragments of the nucleic acid 
molecules of the invention encompass fragments of SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 14, 
16, 18, 20, 23, 25, 27, 29, 31 and 33 as well as fragments of the variants of those sequences 
identified as defined above by percent homology or hybridization. 

Examples of fragments encompassed by the invention include exons of the PGSIP 
gene. SEQ ID NO: 1 indicates exon and intron boundaries of the plant glycogenin-like gene 
PGSIP. Nucleic acid molecules comprising PGSIP exon and intron sequences are 
encompassed by the present invention. In one embodiment, five exons are included (SEQ ID 
NO: 1; GenBank Accession No. NM_112752). PGSIP exon 1 encompasses nucleotides 516 to 
592 of the sequence shown in SEQ ID NO: 1; exon 2 encompasses 
nucleotides 681 to 918 of the sequence shown in SEQ ID NO: 1; exon 3 encompasses 
nucleotides 1039 to 1655 of the sequence shown in SEQ ID NO: 1; exon 4 encompasses 
nucleotides 1762 to 2536 of the sequence shown in SEQ ID NO: 1; exon 5 encompasses 
nucleotides 2991 to 3264 of the sequence shown in SEQ ID NO: 1. 
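
By way of illustration, the exon coordinates given above can be applied to a sequence string as follows. This is a sketch under stated assumptions: the coordinates are treated as 1-based and inclusive, the function and variable names are hypothetical, and the sequence of SEQ ID NO: 1 is not reproduced here, so a placeholder of sufficient length stands in for it.

```python
# Exon boundaries of PGSIP as nucleotide positions in SEQ ID NO: 1
# (assumed 1-based, inclusive).
PGSIP_EXONS = [(516, 592), (681, 918), (1039, 1655), (1762, 2536), (2991, 3264)]


def extract_exons(genomic, exons=PGSIP_EXONS):
    """Return the exon subsequences of `genomic` in order."""
    pieces = []
    for start, end in exons:
        if not 1 <= start <= end <= len(genomic):
            raise ValueError("exon %d-%d lies outside the sequence" % (start, end))
        # Convert 1-based inclusive coordinates to a Python slice.
        pieces.append(genomic[start - 1:end])
    return pieces


# Placeholder sequence only; substitute the real sequence of SEQ ID NO: 1.
placeholder = "ACGT" * 816  # 3264 nucleotides
exon_seqs = extract_exons(placeholder)
exon_lengths = [len(s) for s in exon_seqs]
spliced = "".join(exon_seqs)  # the five exons joined in order
```

Joining the pieces in order gives the combined exon sequence, which is one example of a larger subsequence built from two or more of the fragments.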

Further, a plant glycogenin-like nucleic acid molecule of the invention can comprise 
two or more of any above-described sequences, or variants thereof, linked together to form a 
larger subsequence. 

The nucleic acid molecules of the invention can comprise or consist of an EST 
sequence. The EST nucleic acid molecules of the invention can be used as probes for cloning 
corresponding full length genes. For example, the barley EST of SEQ ID NO: 16 can be 
utilized as a probe in identifying and cloning the full length barley homolog of the 
Arabidopsis PGSIP gene. The EST nucleic acid molecules of the invention may be used as 
sequence probes in connection with computer software to search databases, such as GenBank 
for homologous sequences. Alternatively, the EST nucleic acid molecules can be used as 
probes in hybridization reactions as described herein. The EST nucleic acid molecules of the 
invention can also be used as molecular markers to map chromosome regions. 

In certain embodiments, the plant glycogenin-like nucleic acid molecules and 
polypeptides do not include sequences consisting of those sequences known in the art. For 
example, in one embodiment, the plant glycogenin-like nucleic acid molecules do not include 
EST sequences. 

In other embodiments, the plant glycogenin-like nucleic acid molecules of the 
invention encode polypeptides that function as plant glycogenin-like proteins. The 
functionality of such nucleic acid molecules can be assessed using the yeast hybrid 
complementation assay as described herein in Example 3. Alternatively, the functionality of 
such nucleic acid molecules can be assessed using a complementation assay in Arabidopsis as 
described in this section. 

An isolated nucleic acid molecule encoding a variant protein can be created by 
introducing one or more nucleotide substitutions, additions or deletions into the plant 
glycogenin-like nucleic acid molecule, such that one or more amino acid substitutions, 
additions or deletions are introduced into the encoded protein. Mutations can be introduced 
by standard techniques, such as ethyl methane sulfonate treatment, X-rays, gamma rays, T-DNA 
mutagenesis, site-directed mutagenesis, or PCR-mediated mutagenesis. Briefly, in 
PCR-mediated mutagenesis, PCR primers 
are designed that delete the trinucleotide codon of the amino acid to be changed and replace it 
with the trinucleotide codon of the amino acid to be included. This primer is used in the PCR 
amplification of DNA encoding the protein of interest. This fragment is then isolated and 
inserted into the full length cDNA encoding the protein of interest and expressed 
recombinantly. 
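
The sequence change that such a primer introduces can be modelled in silico as a simple codon swap. The sketch below covers only that step, not primer design or amplification; the function name, the residue numbering (1-based, counted from the first codon of the coding sequence) and the example sequence are assumptions made for illustration.

```python
def replace_codon(cds, residue_number, new_codon):
    """Return `cds` with the codon for `residue_number` replaced by `new_codon`.

    `cds` is an in-frame coding sequence; residue 1 is its first codon.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be exactly 3 nucleotides")
    if not 1 <= residue_number <= len(cds) // 3:
        raise ValueError("residue number is outside the coding sequence")
    start = (residue_number - 1) * 3
    # Delete the old trinucleotide and put the new one in its place.
    return cds[:start] + new_codon + cds[start + 3:]


# Hypothetical example: Met-Ala-Lys -> Met-Val-Lys (GCT -> GTT at residue 2).
mutant = replace_codon("ATGGCTAAA", 2, "GTT")
```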

An isolated nucleic acid molecule encoding a variant protein can be created by any of 
the methods described in section 1.1. Either conservative or non-conservative amino acid 
substitutions, or both, can be made at one or more amino acid residues. Conservative 
replacements are those that take place 
within a family of amino acids that are related in their side chains. Genetically encoded 
amino acids can be divided into four families: (1) acidic = aspartate, glutamate; (2) basic 
= lysine, arginine, histidine; (3) nonpolar = alanine, valine, leucine, isoleucine, proline, 
phenylalanine, methionine, tryptophan; and (4) uncharged polar = glycine, asparagine, 
glutamine, cysteine, serine, threonine, tyrosine. In similar fashion, the amino acid repertoire 
can be grouped as (1) acidic = aspartate, glutamate; (2) basic = lysine, arginine, histidine; (3) 
aliphatic = glycine, alanine, valine, leucine, isoleucine, serine, threonine, with serine and 
threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic = 
phenylalanine, tyrosine, tryptophan; (5) amide = asparagine, glutamine; and (6) 
sulfur-containing = cysteine and methionine. (See, for example, Biochemistry, 4th ed., Ed. by L. 
Stryer, WH Freeman and Co.: 1995). 
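
The four-family grouping above lends itself to a simple check of whether a given substitution is conservative. The sketch below is illustrative only: the standard one-letter amino acid codes and the function names are introduced here for convenience and do not appear in the text, and only the first (four-family) grouping is encoded.

```python
# The four side-chain families listed above, as one-letter codes.
AMINO_ACID_FAMILIES = {
    "acidic": set("DE"),             # aspartate, glutamate
    "basic": set("KRH"),             # lysine, arginine, histidine
    "nonpolar": set("AVLIPFMW"),     # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),  # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue):
    """Return the family name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in AMINO_ACID_FAMILIES.items():
        if code in members:
            return name
    raise ValueError("not one of the 20 genetically encoded amino acids: %r" % residue)


def is_conservative(original, replacement):
    """True if both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)
```

For example, under this grouping an aspartate to glutamate change is conservative, whereas leucine to lysine is not.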

Alternatively, mutations can be introduced randomly along all or part of the coding 
sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 
biological activity to identify mutants that retain activity. Following mutagenesis, the 
encoded protein can be expressed recombinantly and the activity of the protein can be 
determined. 

The invention also encompasses (a) DNA vectors that contain any of the foregoing 
nucleic acids and/or coding sequences (i.e., fragments and variants) and/or their complements 
(i.e., antisense molecules); (b) DNA expression vectors that contain any of the foregoing 
nucleic acids and/or coding sequences operatively associated with a regulatory region that 
directs the expression of the nucleic acids and/or coding sequences; and (c) genetically 
engineered host cells that contain any of the foregoing nucleic acids and/or coding sequences 
operatively associated with a regulatory region that directs the expression of the gene and/or 
coding sequences in the host cell. As used herein, regulatory regions include, but are not 
limited to, inducible and non-inducible genetic elements known to those skilled in the art that 
drive and regulate expression of a nucleic acid. The nucleic acid molecules of the invention 
may be under the control of a promoter, enhancer, operator, cis-acting sequences, or 
trans-acting factors, or other regulatory sequence. The nucleic acid molecules encoding regulatory 
regions of the invention may also be functional fragments of a promoter or enhancer. The 
nucleic acid molecule encoding a regulatory region is preferably one which will target 
expression to desired cells, tissues, or developmental stages. 

Examples of highly suitable nucleic acid molecules encoding regulatory regions are 
endosperm specific promoters, such as that of the high molecular weight glutenin (HMWG) 
gene of wheat, prolamin, or ITR1, or other suitable promoters available to the skilled person 
such as gliadin, branching enzyme, ADPG pyrophosphorylase, patatin, starch synthase, rice 
actin, and actin, for example. 

Other suitable promoters include the stem organ specific promoter gSPO-A, the seed 
specific promoters Napin, KTI 1, 2, & 3, beta-conglycinin, beta-phaseolin, helianthin, 
phytohemagglutinin, legumin, zein, lectin, leghemoglobin c3, ABI3, PvAlf, SH-EP, EP-C1, 
2S1, EM1, and ROM2. 

Constitutive promoters, such as CaMV promoters, including CaMV 35S and CaMV 
19S may also be suitable. Other examples of constitutive promoters include Actin 1, 
Ubiquitin 1, and HMG2. 

In addition, the regulatory region of the invention may be one which is environmental 
factor-regulated such as promoters that respond to heat, cold, mechanical stress, light, 
ultraviolet light, drought, salt and pathogen attack. The regulatory region of the invention may 
also be one which is a hormone-regulated promoter that induces gene expression in response 
to phytohormones at different stages of plant growth. Useful inducible promoters include, 
but are not limited to, the promoters of ribulose bisphosphate carboxylase (RUBISCO) genes, 
chlorophyll a/b binding protein (CAB) genes, heat shock genes, the defense responsive gene 
(e.g., phenylalanine ammonia lyase genes), wound induced genes (e.g., hydroxyproline rich 
cell wall protein genes), chemically-inducible genes (e.g., nitrate reductase genes, glucanase 
genes, chitinase genes, PR-1 genes etc.), dark-inducible genes (e.g., asparagine synthetase 
gene as described by U.S. Patent 5,256,558), and developmental-stage specific genes (e.g., 
Shoot Meristemless gene, ABI3 promoter and the 2S1 and Em1 promoters for seed 
development (Devic et al., 1996, Plant Journal 9(2):205-215), and the kin1 and cor6.6 
promoters for seed development (Wang et al., 1995, Plant Molecular Biology, 28(4):619- 
634)). Examples of other inducible promoters and developmental-stage specific promoters 
can be found in Datla et al., in particular in Table 1 of that publication (Datla et al., 1997, 
Biotechnology annual review 3:269-296). 

A vector of the invention may also contain a sequence encoding a transit peptide 
which can be fused in-frame such that it is expressed as a fusion protein. 

Methods which are well known to those skilled in the art can be used to construct 
vectors and/or expression vectors containing plant glycogenin-like protein coding sequences 
and appropriate transcriptional/translational control signals. These methods include, for 
example, in vitro recombinant DNA techniques, synthetic techniques and in vivo 
recombination/genetic recombination. See, for example, the techniques described in 
Sambrook et al., 1989, and Ausubel et al., 1989. Alternatively, RNA capable of encoding 
plant glycogenin-like protein sequences may be chemically synthesized using, for example, 
synthesizers. See, for example, the techniques described in Gait, 1984, Oligonucleotide 
Synthesis, IRL Press, Oxford. In a preferred embodiment of the invention, the techniques 
described in Example 6, and illustrated in Figure 6 are used to construct a vector. 

A variety of host-expression vector systems may be utilized to express the plant 
glycogenin-like gene products of the invention. Such host-expression systems represent 
vehicles by which the plant glycogenin-like gene products of interest may be produced and 
subsequently recovered and/or purified from the culture or plant (using purification methods 
well known to those skilled in the art), but also represent cells which may, when transformed 
or transfected with the appropriate nucleic acid molecules, exhibit the plant glycogenin-like 
protein of the invention in situ. These include but are not limited to microorganisms such as 
bacteria (e.g., E. coli, B. subtilis) transformed with recombinant bacteriophage DNA, plasmid 
DNA or cosmid DNA expression vectors containing plant glycogenin-like protein coding 
sequences; yeast (e.g., Saccharomyces, Pichia) transformed with recombinant yeast 
expression vectors containing the plant glycogenin-like protein coding sequences; insect cell 
systems infected with recombinant virus expression vectors (e.g., baculovirus) containing the 
plant glycogenin-like protein coding sequences; plant cell systems infected with recombinant 
virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV); 
plant cell systems transformed with recombinant plasmid expression vectors (e.g., Ti 
plasmid) containing plant glycogenin-like protein coding sequences; or mammalian cell 
systems (e.g., COS, CHO, BHK, 293, 3T3) harboring recombinant expression constructs 
containing promoters derived from the genome of mammalian cells (e.g., metallothionein 
promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 
7.5K promoter; the cytomegalovirus promoter/enhancer; etc.). In a preferred embodiment of 
the invention, an expression vector comprising a plant glycogenin-like nucleic acid molecule 
operably linked to at least one suitable regulatory sequence is incorporated into a plant by one 
of the methods described in this section, sections 1.3, 1.4 and 1.5, or in Examples 7, 8, 9, and 
12. 

In bacterial systems, a number of expression vectors may be advantageously selected 
depending upon the use intended for the plant glycogenin-like protein being expressed. For 
example, when a large quantity of such a protein is to be produced, for the generation of 
antibodies or to screen peptide libraries, for example, vectors which direct the expression of 
high levels of fusion protein products that are readily purified may be desirable. Such vectors 
include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al., 1983, 
EMBO J. 2:1791), in which the plant glycogenin-like coding sequence may be ligated 
individually into the vector in frame with the lac Z coding region so that a fusion protein is 
produced; pIN vectors (Inouye & Inouye, 1985, Nucleic Acids Res. 13:3101-9; Van Heeke & 
Schuster, 1989, J. Biol. Chem. 264:5503-9); and the like. pGEX vectors may also be used to 
express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In 
general, such fusion proteins are soluble and can easily be purified from lysed cells by 
adsorption to glutathione-agarose beads followed by elution in the presence of free 
glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage 
sites so that the cloned target gene protein can be released from the GST moiety. 

In one such embodiment of a bacterial system, full length cDNA nucleic acid 
molecules are appended with in-frame BamHI sites at the amino terminus and EcoRI sites at 
the carboxyl terminus using standard PCR methodologies (Innis et al., 1990, supra) and 
ligated into the pGEX-2TK vector (Pharmacia, Uppsala, Sweden). The resulting cDNA 
construct contains a kinase recognition site at the amino terminus for radioactive labeling and 
glutathione S-transferase sequences at the carboxyl terminus for affinity purification (Nilsson, 
et al., 1985, EMBO J. 4:1075; Zabeau and Stanley, 1982, EMBO J. 1:1217). 

The recombinant constructs of the present invention may include a selectable marker 
for propagation of the construct. For example, a construct to be propagated in bacteria 
preferably contains an antibiotic resistance gene, such as one that confers resistance to 
kanamycin, tetracycline, streptomycin, or chloramphenicol. Examples of other suitable 
marker genes include antibiotic resistance genes such as those conferring resistance to G418 
and hygromycin (npt-II, hyg-B); herbicide resistance genes such as those conferring 
resistance to phosphinothricin and sulfonamide based herbicides (bar and sul, respectively; 
EP-A-242246, EP-A-0369637); and screenable markers such as beta-glucuronidase (GB2197653), 
luciferase and green fluorescent protein. Suitable vectors for propagating the 
construct include, but are not limited to, plasmids, cosmids, bacteriophages or viruses. 

The marker gene is preferably controlled by a second promoter which allows 
expression in cells other than the seed, thus allowing selection of cells or tissue containing the 
marker at any stage of development of the plant. Preferred second promoters are the 
promoter of nopaline synthase gene of Agrobacterium and the promoter derived from the 
gene which encodes the 35S subunit of cauliflower mosaic virus (CaMV) coat protein. 
However, any other suitable second promoter may be used. 

The nucleic acid molecule encoding a plant glycogenin-like protein may be native or 
foreign to the plant into which it is introduced. One of the effects of introducing a nucleic 
acid molecule encoding a plant glycogenin-like gene into a plant is to increase the amount of 
plant glycogenin-like protein present and therefore the amount of starch produced by 
increasing the copy number of the nucleic acid molecule. Foreign plant glycogenin-like 
nucleic acid molecules may in addition have different temporal and/or spatial specificity for 
starch synthesis compared to the native plant glycogenin-like protein of the plant, and so may 
be useful in altering when and where or what type of starch is produced. Regulatory elements 
of the plant glycogenin-like genes may also be used in altering starch synthesis in a plant, for 
example by replacing the native regulatory elements in the plant or providing additional 
control mechanisms. The regulatory regions of the invention may confer expression of a 
plant glycogenin-like gene product in a chemically-inducible, dark-inducible, 
developmentally regulated, developmental-stage specific, wound-induced, environmental 
factor-regulated, organ-specific, cell-specific, tissue-specific, or constitutive manner. 
Alternatively, the expression conferred by a regulatory region may encompass more than one 
type of expression selected from the group consisting of chemically-inducible, 
dark-inducible, developmentally regulated, developmental-stage specific, wound-induced, 
environmental factor-regulated, organ-specific, cell-specific, tissue-specific, and constitutive. 

Further, any of the nucleic acid molecules (including EST clone nucleic acid 
molecules) and/or polypeptides and proteins described herein, can be used as markers for 
qualitative trait loci in breeding programs for crop plants. To this end, the nucleic acid 
molecules, including, but not limited to, full length plant glycogenin-like genes coding 
sequences, and/or partial sequences (ESTs), can be used in hybridization and/or DNA 
amplification assays to identify the endogenous plant glycogenin-like genes, plant 
glycogenin-like gene mutant alleles and/or plant glycogenin-like gene expression products in 
cultivars as compared to wild-type plants. They can also be used as markers for linkage 
analysis of qualitative trait loci. It is also possible that the plant glycogenin-like genes may 
encode a product responsible for a qualitative trait that is desirable in a crop breeding 
program. Alternatively, the plant glycogenin-like protein and/or peptides can be used as 
diagnostic reagents in immunoassays to detect expression of the plant glycogenin-like genes
in cultivars and wild-type plants.

Genetically-engineered plants containing constructs comprising the plant glycogenin-like
nucleic acid and a reporter gene can be generated using the methods described herein for
each plant glycogenin-like nucleic acid gene variant, to screen for loss-of-function variants 
induced by mutations, including but not limited to, deletions, point mutations, 
rearrangements, translocation, etc. The constructs can encode for fusion proteins comprising 
a plant glycogenin-like protein fused to a protein product encoded by a reporter gene. 



Alternatively, the constructs can encode for a plant glycogenin-like protein and a reporter 
gene product that are not fused. The constructs may be transformed into the homozygous 
recessive plant glycogenin-like gene mutant background, and the restorative phenotype 
examined, i.e. quantity and quality of starch, as a complementation test to confirm the 
functionality of the variants isolated. 

1.1 PLANT GLYCOGENIN-LIKE GENE PRODUCTS 
The invention encompasses the polypeptides of SEQ ID NOs: 3, 7, 11, 13, 15, 17, 19,
21, 22, 24, 26, 28, 30, 31, 32, or 34. Plant glycogenin-like proteins, polypeptides and peptide
fragments, variants, allelic variants, mutated, truncated or deleted forms of plant glycogenin-like
proteins and/or plant glycogenin-like fusion proteins can be prepared for a variety of
uses, including, but not limited to, the generation of antibodies, as reagents in assays, the
identification of other cellular gene products involved in starch synthesis and/or starch 
synthesis initiation, etc. 

Plant glycogenin-like translational products include, but are not limited to those 
proteins and polypeptides encoded by the sequences of the plant glycogenin-like nucleic acid 
molecules of the invention. The invention encompasses proteins that are functionally 
equivalent to the plant glycogenin-like gene products of the invention. 

The primary use of the plant glycogenin-like gene products of the invention is to alter
starch synthesis via increasing the number of priming or initiation sites for elongation of 
glucose chains. 

In an embodiment of the invention, an isolated polypeptide comprises the amino acid 
molecule of SEQ ID NO: 9 or a variant or fragment thereof, provided the polypeptide 
sequence is not that of SEQ ID NO: 35. 

The present invention also provides variants of the polypeptides of the invention. 
Such variants have an altered amino acid sequence which can function as either agonists 
(mimetics) or as antagonists. Variants can be generated by mutagenesis, e.g., discrete point 
mutation or truncation. An agonist can retain substantially the same, or a subset, of the 
biological activities of the naturally occurring form of the protein. An antagonist of a protein 
can inhibit one or more of the activities of the naturally occurring form of the protein by, for
example, deleting one or more of the receiver domains. Thus, specific biological effects can
be elicited by addition of a variant of limited function. 

Modification of the structure of the subject polypeptides can be for such purposes as 
enhancing efficacy, stability, or post-translational modifications (e.g., to alter the 
phosphorylation pattern of the protein). Such modified peptides, when designed to retain at 
least one activity of the naturally-occurring form of the protein, or to produce specific 
antagonists thereof, are considered functional equivalents of the polypeptides. Such modified
peptides can be produced, for instance, by amino acid substitution, deletion, or addition.

For example, it is reasonable to expect that an isolated replacement of a leucine with 
an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar 
replacement of an amino acid with a structurally related amino acid (i.e. isosteric and/or 
isoelectric mutations) will not have a major effect on the biological activity of the resulting 
molecule. 
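
The replacements named above can be checked mechanically when comparing a variant with its wild-type sequence. The following is a minimal Python sketch, assuming one-letter amino acid codes and using only the three groups named in the preceding paragraph (leucine/isoleucine/valine, aspartate/glutamate, threonine/serine). The group table and the function names are illustrative assumptions, not an exhaustive classification and not a method taken from this specification.

```python
# Illustrative groups built only from the examples named in the text;
# an assumption for this sketch, not an exhaustive classification.
CONSERVATIVE_GROUPS = [
    frozenset("LIV"),  # leucine, isoleucine, valine
    frozenset("DE"),   # aspartate, glutamate
    frozenset("ST"),   # serine, threonine
]


def is_conservative(original, replacement):
    """Return True if both residues fall within the same illustrative group."""
    if original == replacement:
        return True
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def list_substitutions(wild_type, variant):
    """Yield (position, wild-type residue, variant residue, conservative?) per difference."""
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    for pos, (wt, var) in enumerate(zip(wild_type, variant), start=1):
        if wt != var:
            yield pos, wt, var, is_conservative(wt, var)
```

Such a check only flags which replacements fall within the named groups; whether a given variant retains activity would still be determined by the assays described in the following paragraphs.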

Whether a change in the amino acid sequence of a peptide results in a functional 
homolog (e.g., functional in the sense that the resulting polypeptide mimics or antagonizes 
the wild-type form) can be readily determined by assessing the ability of the variant peptide 
to produce a response in cells in a fashion similar to the wild-type protein, or competitively 
inhibit such a response. Polypeptides in which more than one replacement has taken place 
can readily be tested in the same manner. 

In a preferred embodiment, a mutant polypeptide that is a variant of a polypeptide of 
the invention can be assayed for: (1) the ability to complement glycogenin function in a yeast
or plant system in which the native glycogenin or plant glycogenin-like genes have been
knocked out; (2) the ability to form a complex with a glucose or oligosaccharide; or (3) the 
ability to promote initiation of elongation of polysaccharide chains. 

The invention encompasses functionally equivalent mutant plant glycogenin-like 
proteins and polypeptides. The invention also encompasses mutant plant glycogenin-like 
proteins and polypeptides that are not functionally equivalent to the gene products. Such a 
mutant plant glycogenin-like protein or polypeptide may contain one or more deletions, 
additions or substitutions of plant glycogenin-like amino acid residues within the amino acid 
sequence encoded by any one of the plant glycogenin-like nucleic acid molecules described
above in Section 1.1, and which result in loss of one or more functions of the plant
glycogenin-like protein, thus producing a plant glycogenin-like gene product not
functionally-equivalent to the wild-type plant glycogenin-like protein.

Plant glycogenin-like proteins and polypeptides bearing mutations can be made by
introducing mutations into plant glycogenin-like DNA (using techniques discussed above as
well as those well known to one of skill in the art), and the resulting mutant plant
glycogenin-like proteins tested for
activity. Mutants can be isolated that display increased function (e.g., resulting in improved
root formation), or decreased function (e.g., resulting in suboptimal root function). In 
particular, mutated plant glycogenin-like proteins in which any of the exons shown in SEQ 
ID NO: 1 are deleted or mutated are within the scope of the invention. Additionally, peptides 
corresponding to one or more exons of the plant glycogenin-like protein, truncated or deleted 
plant glycogenin-like protein are also within the scope of the invention. Fusion proteins in
which the full length plant glycogenin-like protein or a plant glycogenin-like polypeptide or
peptide is fused to an unrelated protein are also within the scope of the invention and can be
designed on the basis of the plant glycogenin-like nucleotide and plant glycogenin-like amino 
acid sequences disclosed herein. 

While the plant glycogenin-like polypeptides and peptides can be chemically 
synthesized (e.g., see Creighton, 1983, Proteins: Structures and Molecular Principles, W.H.
Freeman & Co., NY), large polypeptides derived from a plant glycogenin-like gene and the full
length plant glycogenin-like gene product may advantageously be produced by recombinant DNA
technology using techniques well known to those skilled in the art for expressing nucleic acid 
molecules. 

Nucleotides encoding fusion proteins may include, but are not limited to, nucleotides 
encoding full length plant glycogenin-like proteins, truncated plant glycogenin-like proteins, 
or peptide fragments of plant glycogenin-like proteins fused to an unrelated protein or 
peptide, such as for example, an enzyme, fluorescent protein, or luminescent protein that can 
be used as a marker or an epitope that facilitates affinity-based purification. Alternatively,
the fusion protein can further comprise a heterologous protein such as a transit peptide or
fluorescent protein.

In an embodiment of the invention, the percent identity between two polypeptides of
the invention is at least 40%. In a preferred embodiment of the invention, the percent identity
between two polypeptides of the invention is at least 50%. In another embodiment, the
percent identity between two polypeptides of the invention is at least 60%, 70%,
80%, 90%, 95%, 96%, 97%, or at least 98%. Determining whether two sequences are
substantially similar may be carried out using any methodologies known to one skilled in the
art, preferably using computer assisted analysis as described in section 1.1.
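
As an illustration of how a percent identity figure can be compared against the thresholds above, the following Python sketch computes identity over two sequences that have already been aligned (gaps written as "-"). The choice of denominator (all aligned columns, excluding columns where both sequences carry a gap) is an assumption made for this sketch; alignment programs differ on this convention, and the sketch is not the computer assisted analysis referred to in the text. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned and of equal length. Columns where
    both sequences have a gap are ignored; a gap opposite a residue counts as
    a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=40.0):
    """Return True if the pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, the hypothetical aligned pair "MKT-LV" and "MRTALV" shares four identical columns out of six, so it would satisfy the 40%, 50% and 60% thresholds but not the 70% threshold.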

Further, it may be desirable to include additional DNA sequences in the protein 
expression constructs. Examples of additional DNA sequences include, but are not limited 
to, those encoding: a 3' untranslated region; a transcription termination and polyadenylation 
signal; an intron; a signal peptide (which facilitates the secretion of the protein); or a transit 
peptide (which targets the protein to a particular cellular compartment such as the nucleus, 
chloroplast, mitochondria or vacuole). The nucleic acid molecules of the invention will 
preferably comprise a nucleic acid molecule encoding a transit peptide, to ensure delivery of 
any expressed protein to the plastid. Preferably the transit peptide will be selective for 
plastids such as amyloplasts or chloroplasts, and can be native to the nucleic acid molecule of 
the invention or derived from known plastid sequences, such as those from the small subunit 
of the ribulose bisphosphate carboxylase enzyme (ssu of rubisco) from pea, maize or
sunflower for example. A transit peptide comprising amino acid residues 1-65 of SEQ ID NO:
2 is an example of a transit peptide native to the polypeptide of the invention. Where an
agonist or antagonist which modulates activity of the plant glycogenin-like protein is a
polypeptide, the polypeptide itself must be appropriately targeted to the plastids, for example
by the presence of a plastid targeting signal at the N terminal end of the protein (Castro Silva
Filho et al., Plant Mol Biol 30 769-780 (1996)) or by protein-protein interaction (Schenke PC et
al., Plant Physiol 122 235-241 (2000) and Schenke et al., PNAS 98(2) 765-770 (2001)). The
transit peptides of the invention are used to target transportation of plant glycogenin-like 
proteins as well as agonists or antagonists thereof to plastids, the sites of starch synthesis, 
thus altering the starch synthesis process and resulting starch characteristics. 

The plant glycogenin-like proteins and transit peptides associated with the plant 
glycogenin-like genes of the present invention have a number of important agricultural uses. 
The transit peptides associated with the plant glycogenin-like genes of the invention may be
used, for example, in transportation of desired heterologous gene products to a root, a root
modified through evolution, tuber, stem, a stem modified through evolution, seed, and/or 
endosperm of transgenic plants transformed with such constructs. 

The invention encompasses methods of screening for agents (i.e., proteins, small 
molecules, peptides) capable of altering the activity of a plant glycogenin-like protein in a 
plant. Variants of a protein of the invention which function as either agonists (mimetics) or 
as antagonists can be identified by screening combinatorial libraries of mutants, e.g., 
truncation mutants, of the protein of the invention for agonist or antagonist activity. In one 
embodiment, a variegated library of variants is generated by combinatorial mutagenesis at the 
nucleic acid level and is encoded by a variegated gene library. A variegated library of 
variants can be produced by, for example, enzymatically ligating a mixture of synthetic 
oligonucleotides into nucleic acid molecules such that a degenerate set of potential protein 
sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion 
proteins (e.g., for phage display). There are a variety of methods which can be used to 
produce libraries of potential variants of the polypeptides of the invention from a degenerate 
oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known 
in the art (see, e.g., Narang, 1983, Tetrahedron 39:3; Itakura et al., 1984, Annu. Rev.
Biochem. 53:323; Itakura et al., 1984, Science 198:1056; Ike et al., 1983, Nucleic Acid
Res. 11:477).
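
To make concrete what "a degenerate set of potential protein sequences" means at the nucleic acid level, the following Python sketch enumerates every explicit sequence encoded by a degenerate oligonucleotide written in the standard IUPAC ambiguity codes. The function names and the example oligonucleotides are hypothetical and chosen only for illustration; the sketch does not reproduce any particular synthesis method from the references cited above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every explicit sequence represented by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


def library_size(oligo):
    """Number of distinct sequences in the library, without enumerating them."""
    size = 1
    for base in oligo.upper():
        size *= len(IUPAC[base])
    return size
```

Calling `expand_degenerate("GAY")` yields the two codons GAC and GAT, and `library_size("NNK")` gives 32 for a single fully randomized NNK codon. Because library size grows multiplicatively with each degenerate position, `library_size` is the safer call for long oligonucleotides.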

In addition, libraries of fragments of the coding sequence of a polypeptide of the 
invention can be used to generate a variegated population of polypeptides for screening and 
subsequent selection of variants. For example, a library of coding sequence fragments can be 
generated by treating a double stranded PCR fragment of the coding sequence of interest with 
a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing 
the double stranded DNA, renaturing the DNA to form double stranded DNA which can 
include sense/antisense pairs from different nicked products, removing single stranded 
portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting
fragment library into an expression vector. By this method, an expression library can be 
derived which encodes N-terminal and internal fragments of various sizes of the protein of 
interest.

Several techniques are known in the art for screening gene products of combinatorial
libraries made by point mutations or truncation, and for screening cDNA libraries for gene 
products having a selected property. The most widely used techniques, which are amenable 
to high throughput analysis, for screening large gene libraries typically include cloning the
gene library into replicable expression vectors, transforming appropriate cells with the 
resulting library of vectors, and expressing the combinatorial genes under conditions in which 
detection of a desired activity facilitates isolation of the vector encoding the gene whose 
product was detected. Recursive ensemble mutagenesis (REM), a technique which enhances 
the frequency of functional mutants in the libraries, can be used in combination with the 
screening assays to identify variants of a protein of the invention (Arkin and Youvan, 1992,
Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al., 1993, Protein Engineering
6(3):327-331). 

An isolated polypeptide of the invention, or a fragment thereof, can be used as an 
immunogen to generate antibodies using standard techniques for polyclonal and monoclonal 
antibody preparation. The full-length polypeptide or protein can be used or, alternatively, the 
invention provides antigenic peptide fragments for use as immunogens. In one embodiment, 
the antigenic peptide of a protein of the invention or fragments or immunogenic fragments of 
a protein of the invention comprise at least 8 (preferably 10, 15, 20, 30 or 35) consecutive 
amino acid residues of the amino acid sequence of SEQ ID NO: 3, 7, 9, 11, 13, 15, 17, 19, 21, 
22, 24, 26, 28, 30, 32, or 34 and encompasses an epitope of the protein such that an antibody 
raised against the peptide forms a specific immune complex with the protein. 
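
The length requirement above (at least 8 consecutive residues) can be illustrated by listing every window of a chosen length from a polypeptide sequence. The following Python sketch is a simple enumeration only: it does not predict which windows contain an epitope, and the function name and the example sequence are hypothetical rather than taken from the sequence listing.

```python
def candidate_peptides(sequence, length=8):
    """List (1-based start position, peptide) for every window of `length` residues.

    Enforces the minimum of 8 consecutive residues stated in the text.
    Returns an empty list if the sequence is shorter than `length`.
    """
    if length < 8:
        raise ValueError("the text specifies at least 8 consecutive residues")
    return [(start + 1, sequence[start:start + length])
            for start in range(len(sequence) - length + 1)]
```

For a hypothetical 11-residue sequence "MKTLLVAGSSR", `candidate_peptides("MKTLLVAGSSR", 8)` returns four overlapping 8-residue peptides, the first starting at residue 1 and the last at residue 4.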

Exemplary amino acid sequences of the polypeptides of the invention can be used to 
generate antibodies against plant glycogenin-like gene products. In one embodiment, the
immunogenic polypeptide is conjugated to keyhole limpet hemocyanin ("KLH") and injected
into rabbits. Rabbit IgG polyclonal antibodies can be purified, for example, on a peptide affinity
column. The antibodies can then be used to bind to and identify the polypeptides of the
invention that have been extracted and separated via gel electrophoresis or other means. 

One aspect of the invention pertains to isolated plant glycogenin-like polypeptides of 
the invention, variants thereof, as well as variants suitable for use as immunogens to raise 
antibodies directed against a plant glycogenin-like polypeptide of the invention. In one
embodiment, the native polypeptide can be isolated, using standard protein purification
techniques, from cells or tissues expressing a plant glycogenin-like polypeptide. In a 
preferred embodiment, plant glycogenin-like polypeptides of the invention are produced from 
expression vectors by recombinant DNA techniques. In another preferred embodiment, a 
polypeptide of the invention is synthesized chemically using standard peptide synthesis 
techniques. 

An isolated or purified protein or biologically active portion thereof is substantially 
free of cellular material or other contaminating proteins from the cell or tissue source from 
which the protein is derived, or substantially free of chemical precursors or other chemicals 
when chemically synthesized. The language "substantially free" indicates protein 
preparations in which the protein is separated from cellular components of the cells from 
which it is isolated or recombinantly produced. Thus, protein that is substantially free of 
cellular material includes protein preparations having less than 20%, 10%, or 5% (by dry
weight) of a contaminating protein. Similarly, when an isolated plant glycogenin-like 
polypeptide of the invention is recombinantly produced, it is substantially free of culture 
medium. When the plant glycogenin-like polypeptide is produced by chemical synthesis, it is 
preferably substantially free of chemical precursors or other chemicals. 

Biologically active portions of a polypeptide of the invention include polypeptides 
comprising amino acid sequences identical to or derived from the amino acid sequence of the 
protein, such that the variant sequences comprise conservative substitutions or truncations
(e.g., amino acid sequences comprising fewer amino acids than those shown in any of SEQ
ID NOs: 3, 7, 9, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, and 34, but which maintain a
high degree of homology to the remaining amino acid sequence). Typically, biologically
active portions comprise a domain or motif with at least one activity of the corresponding
protein. A biologically active portion of a
protein of the invention can be a polypeptide which is, for example, at least 10, 25, 50, 100,
200, 300, 400 or 500 amino acids in length. Polypeptides of the invention can comprise, for
example, domains or motifs such as a glycosylation domain or site for complexing with
polysaccharide or for
attachment of disaccharide or a monomeric unit thereof, or a site that interacts with starch
synthase and other enzymes that act on the polysaccharide.

1.2 PRODUCTION OF TRANSGENIC PLANTS AND PLANT CELLS
The invention also encompasses transgenic or genetically-engineered plants, and 
progeny thereof. As used herein, a transgenic or genetically-engineered plant refers to a
plant and a portion of its progeny which comprises a nucleic acid molecule which is not
native to the initial parent plant. The introduced nucleic acid molecule may originate from
the same species, e.g., if the desired result is over-expression of the endogenous gene, or from
a different species. A transgenic or genetically-engineered plant may be easily identified by a
person skilled in the art by comparing the genetic material from a non-transformed plant, and
a plant produced by a method of the present invention. For example, a transgenic plant may
comprise multiple copies of plant glycogenin-like genes, and/or foreign nucleic acid 
molecules. Transgenic plants are readily distinguishable from non-transgenic plants by 
standard techniques. For example a PCR test may be used to demonstrate the presence or 
absence of introduced genetic material. Transgenic plants may also be distinguished from 
non-transgenic plants at the DNA level by Southern blot or at the RNA level by Northern blot 
or at the protein level by western blot, by measurement of enzyme activity or by starch 
composition or properties. 

The nucleic acids of the invention may be introduced into a cell by any suitable 
means. Preferred means include use of a disarmed Ti-plasmid vector carried by 
Agrobacterium by procedures known in the art, for example as described in EP-A-0116718
and EP-A-0270822. Agrobacterium-mediated transformation methods are now available for
monocots, for example as described in EP 0672752 and WO00/63398. Alternatively, the 
nucleic acid may be introduced directly into plant cells using a particle gun. A further 
method would be to transform a plant protoplast, which involves first removing the cell wall 
and introducing the nucleic acid molecule and then reforming the cell wall. The transformed 
cell can then be grown into a plant. 

In an embodiment of the present invention, Agrobacterium is employed to introduce 
the gene constructs into plants. Such transformations preferably use binary Agrobacterium
T-DNA vectors (Bevan, 1984, Nuc. Acid Res. 12:8711-21), and the co-cultivation procedure
(Horsch et al., 1985, Science 227:1229-31). Generally, the Agrobacterium transformation
system is used to engineer dicotyledonous plants (Bevan et al., 1982, Ann. Rev. Genet.
16:357-84; Rogers et al., 1986, Methods Enzymol. 118:627-41). The Agrobacterium
transformation system may also be used to transform, as well as transfer, DNA to
monocotyledonous plants and plant cells (see Hernalsteen et al., 1984, EMBO J. 3:3039-41;
Hooykass-Van Slogteren et al., 1984, Nature 311:763-4; Grimsley et al., 1987, Nature
325:1677-79; Boulton et al., 1989, Plant Mol. Biol. 12:31-40; Gould et al., 1991, Plant
Physiol. 95:426-34). 

Various alternative methods for introducing recombinant nucleic acid constructs into 
plants and plant cells may also be utilized. These other methods are particularly useful where 
the target is a monocotyledonous plant or plant cell. Alternative gene transfer and 
transformation methods include, but are not limited to, protoplast transformation through 
calcium-, polyethylene glycol (PEG)- or electroporation-mediated uptake of naked DNA (see 
Paszkowski et al., 1984, EMBO J. 3:2717-22; Potrykus et al., 1985, Mol. Gen. Genet.
199:169-177; Fromm et al., 1985, Proc. Natl. Acad. Sci. USA 82:5824-8; Shimamoto, 1989,
Nature 338:274-6), and electroporation of plant tissues (D'Halluin et al., 1992, Plant Cell
4:1495-1505). Additional methods for plant cell transformation include microinjection,
silicon carbide mediated DNA uptake (Kaeppler et al., 1990, Plant Cell Reporter 9:415-8),
and microprojectile bombardment (Klein et al., 1988, Proc. Natl. Acad. Sci. USA 85:4305-9;
Gordon-Kamm et al., 1990, Plant Cell 2:603-18).

According to the present invention, desired plants and plant cells may be obtained by 
engineering the gene constructs described herein into a variety of plant cell types, including, 
but not limited to, protoplasts, tissue culture cells, tissue and organ explants, pollen, embryos 
as well as whole plants. In an embodiment of the present invention, the engineered plant 
material is selected or screened for transformants (i.e., those that have incorporated or 
integrated the introduced gene construct or constructs) following the approaches and methods 
described below. An isolated transformant may then be regenerated into a plant. 
Alternatively, the engineered plant material may be regenerated into a plant, or plantlet, 
before subjecting the derived plant, or plantlet, to selection or screening for the marker gene 
traits. Procedures for regenerating plants from plant cells, tissues or organs, either before or 
after selecting or screening for marker gene or genes, are well known to those skilled in the
art.

A transformed plant cell, callus, tissue or plant may be identified and isolated by 
selecting or screening the engineered plant material for traits encoded by the marker genes 
present on the transforming DNA. For instance, selection may be performed by growing the 
engineered plant material on media containing inhibitory amounts of the antibiotic or 
herbicide to which the transforming marker gene construct confers resistance. Further,
transformed plants and plant cells may also be identified by screening for the activities of any
visible marker genes (e.g., the β-glucuronidase, luciferase, green fluorescent protein, B or C1
anthocyanin genes) that may be present on the recombinant nucleic acid constructs of the
present invention. Such selection and screening methodologies are well known to those 
skilled in the art. 

The present invention is applicable to all plants which produce or store starch. 
Examples of such plants are cereals such as maize, wheat, rice, sorghum, barley; fruit 
producing species such as banana, apple, tomato or pear; root crops such as cassava, potato, 
yam, beet or turnip; oilseed crops such as rapeseed, canola, sunflower, oil palm, coconut, 
linseed or groundnut; meal crops such as soya, bean or pea; and any other suitable species. 

In a preferred embodiment of the present invention, the method comprises the 
additional step of growing the plant and harvesting the starch from a plant part. In order to 
harvest the starch, it is preferred that the plant is grown until plant parts containing starch 
develop, which may then be removed. In a further preferred embodiment, the propagating
material from the plant may be removed, for example the seeds. The plant part can be an 
organ such as a stem, root, leaf, or reproductive body. Alternatively, the plant part may be a 
modified organ such as a tuber, or the plant part is a tissue such as endosperm. 



1.3 TRANSGENIC PLANTS THAT ECTOPICALLY EXPRESS PLANT GLYCOGENIN-LIKE PROTEIN

According to one aspect of the invention, a nucleic acid molecule according to the 
invention is expressed in the plant cell, plant, or part of a plant that comprises a nucleotide 
sequence encoding a plant glycogenin-like protein, fragment or variant thereof. The nucleic
acid molecule expressed in the plant cell can comprise a nucleotide sequence encoding a full
length plant glycogenin-like protein. Examples of such sequences include SEQ ID NOs: 1, 2,
6, 8, 10, 12, and 14, or variants thereof and the corresponding amino acid sequences of
SEQ ID NOs: 3, 7, 9, 11, 13, and 15 or variants thereof.

In an embodiment of the invention, the nucleic acid molecules of the invention are 
expressed in a plant cell and are transcribed only in the sense orientation. A plant that 
expresses a recombinant plant glycogenin-like nucleic acid may be engineered by 
transforming a plant cell with a nucleic acid construct comprising a regulatory region 
operably associated with a nucleic acid molecule, the sequence of which encodes a plant 
glycogenin-like protein or a fragment thereof. In plants derived from such cells, starch 
synthesis is altered in ways described in section 1.6. The term "operably associated" is used
herein to mean that transcription controlled by the associated regulatory region would 
produce a functional mRNA, whose translation would produce the plant glycogenin-like 
protein. Starch may be altered in particular parts of a plant, including but not limited to 
seeds, tubers, leaves, roots and stems or modifications thereof. 

In an embodiment of the invention, a plant is engineered to constitutively express a 
plant glycogenin-like protein in order to alter the starch content of the plant. In a preferred 
embodiment, the starch content is 40%, 30%, 20%, 10%, 5%, 2% greater than that of a
non-engineered control plant(s). In another preferred embodiment, the starch content is 40%,
30%, 20%, 10%, 5%, 2% less than that of a non-engineered control plant(s). 

In another aspect of the invention, where the nucleic acid molecules of the invention
are expressed in a plant cell and are transcribed only in the sense orientation, the plant cell
and plants derived from such cells exhibit altered starch content.
The altered starch content comprises an increase in the ratio of amylose to amylopectin. In 
one embodiment of the invention, the ratio of amylose to amylopectin increases by 2%, 5%, 
10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered control plant(s). 
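
The comparison with a non-engineered control can be expressed as simple arithmetic. The following Python sketch assumes that the stated percentages refer to the relative change in the amylose-to-amylopectin ratio with respect to the control; this reading is an assumption, since the text does not define the calculation, and the function names and input values used for illustration are hypothetical.

```python
def amylose_ratio(amylose, amylopectin):
    """Ratio of amylose to amylopectin (both in the same units, e.g. % of starch)."""
    if amylopectin <= 0:
        raise ValueError("amylopectin must be positive")
    return amylose / amylopectin


def percent_increase(control_ratio, engineered_ratio):
    """Relative change of the engineered ratio versus the control, in percent."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return 100.0 * (engineered_ratio - control_ratio) / control_ratio
```

With hypothetical measurements of 25% amylose and 75% amylopectin in a control plant and 30% and 70% in an engineered plant, the ratio rises from about 0.33 to about 0.43, a relative increase of roughly 28.6%, which would fall between the 20% and 30% figures listed above.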

In a preferred embodiment of the invention, the nucleic acid molecules of the invention
are expressed in a potato plant and are transcribed only in the sense orientation. The
plant, including the tubers, exhibits increased starch content. If the number of
copies of the nucleic acid molecules of the invention that are expressed in a potato plant and
transcribed only in the sense orientation is increased, the starch content of the plant, including
the tubers, increases.

In yet another embodiment of the present invention, it may be advantageous to 
transform a plant with a nucleic acid construct operably linking a modified or artificial 
promoter to a nucleic acid molecule having a sequence encoding a plant glycogenin-like 
protein or a fragment thereof. Such promoters typically have unique expression patterns 
and/or expression levels not found in natural promoters because they are constructed by
recombining structural elements from different promoters. See, e.g., Salina et al., 1992,
Plant Cell 4:1485-93, for examples of artificial promoters constructed from combining
cis-regulatory elements with a promoter core.

In a preferred embodiment of the present invention, the associated promoter is a 
strong root and/or embryo-specific plant promoter such that the plant glycogenin-like protein
is overexpressed in the transgenic plant.

In yet another preferred embodiment of the present invention, the overexpression of
plant glycogenin-like protein in starch producing organs and organelles may be engineered by 
increasing the copy number of the plant glycogenin-like gene. One approach to producing 
such transgenic plants is to transform with nucleic acid constructs that contain multiple 
copies of the complete plant glycogenin-like gene with native or heterolgous promoters. 
Another approach is repeatedly transform successive generations of a plant line with one or 
more copies of the complete plant glycogenin-like gene constructs. Yet another approach is 
to place a complete plant glycogenin-like gene in a nucleic acid construct containing an 
amplification-selectable marker (ASM) gene such as the glutamine synthetase or 
dihydrofolate reductase gene. Cells transformed with such constructs is subjected to 
culturing regimes that select cell lines with increased copies of complete plant glycogenin- 
like gene. See, e.g., Donn et al., 1984, J. Mol. Appl. Genet. 2:549-62, for a selection protocol 
used to isolate of a plant cell line containing amplified copies of the GS gene. Cell lines with 
amplified copies of the plant glycogenin-like gene can then be regenerated into transgenic 
plants.. 
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1.4 TRANSGENIC PLANTS THAT SUPPRESS ENDOGENOUS PLANT GLYCOGENIC 

LIKE PROTEIN EXPRESSION 

The nucleic acid molecules of the invention may also be used to augment the starch 
priming activity of a plant cell, plant, or part of a plant, or alternatively to alter the 
activity of the plant glycogenin-like protein of a plant cell, plant, or part of a plant by 
modifying transcription or translation of the plant glycogenin-like gene. In an embodiment 
of the invention, an antagonist which is capable of altering the expression of a nucleic acid 
molecule of the invention is introduced into a plant in order to alter the synthesis of starch. 
The antagonist may be a protein, a nucleic acid, a chemical antagonist, or any other suitable 
moiety. Typically, the antagonist will function by inhibiting or enhancing transcription from 
the plant glycogenin-like gene, either by affecting regulation of the promoter or the 
transcription process; inhibiting or enhancing translation of any RNA product of the plant 
glycogenin-like gene; inhibiting or enhancing the activity of the plant glycogenin-like 
protein itself; or inhibiting or enhancing the protein-protein interaction of the plant 
glycogenin-like protein and downstream enzymes of the starch biosynthesis pathway. For 
example, where the antagonist is a protein, it may interfere with transcription factor binding 
to the plant glycogenin-like gene promoter, mimic the activity of a transcription factor, 
compete with or mimic the plant glycogenin-like protein, interfere with translation of the 
plant glycogenin-like RNA, or interfere with the interaction of the plant glycogenin-like 
protein and downstream enzymes. Antagonists which are nucleic acids may encode the proteins 
described above, or may be transposons which interfere with expression of the plant 
glycogenin-like gene. 

The suppression may be engineered by transforming a plant with a nucleic acid 
construct encoding an antisense RNA or ribozyme complementary to a segment or the whole 
of the plant glycogenin-like gene RNA transcript, including the mature target mRNA. In 
another embodiment, plant glycogenin-like gene suppression may be engineered by 
transforming a plant cell with a nucleic acid construct encoding a ribozyme that cleaves the 
plant glycogenin-like gene mRNA transcript. 

In another embodiment, the plant glycogenin-like mRNA transcript can be suppressed 
through the use of RNA interference, referred to herein as RNAi. RNAi allows for selective 
knock out of a target gene in a highly effective and specific manner. The RNAi technique 
involves introducing into a cell double-stranded RNA (dsRNA) which corresponds to exon 
portions of a target gene such as an endogenous plant glycogenin-like gene. The dsRNA 
causes the rapid destruction of the target gene's messenger RNA, i.e. an endogenous plant 
glycogenin-like gene mRNA, thus preventing the production of the plant glycogenin-like 
protein encoded by that gene. The RNAi constructs of the invention confer expression of 
dsRNA which corresponds to exon portions of an endogenous plant glycogenin-like gene. 
The strands of RNA that form the dsRNA are complementary strands encoded by a coding 
region, i.e., an exon-encoding sequence, at the 3' end of the plant glycogenin-like gene. 

The dsRNA has an effect on the stability of the mRNA. The mechanism by which 
dsRNA results in the loss of the targeted homologous mRNA is still not well understood 
(Cogoni and Macino, 2000, Genes Dev 10: 638-643; Guru, 2000, Nature 404: 804-808; 
Hammond et al., 2001, Nature Rev Gen 2: 110-119). Current theories suggest that a catalytic 
or amplification process occurs that involves an initiation step and an effector step. 

In the initiation step, input dsRNA is digested into 21-23 nucleotide "guide RNAs". 
These guide RNAs are also referred to as siRNAs, or short interfering RNAs. Evidence 
indicates that siRNAs are produced when a nuclease complex, which recognizes the 3' ends 
of dsRNA, cleaves dsRNA (introduced directly or via a transgene or virus) ~22 nucleotides 
from the 3' end. Successive cleavage events, either by one complex or several complexes, 
degrade the RNA to 19-20 bp duplexes (siRNAs), each with 2-nucleotide 3' overhangs. 
RNase III-type endonucleases cleave dsRNA to produce dsRNA fragments with 2-nucleotide 
3' tails, thus an RNase III-like activity appears to be involved in the RNAi mechanism. 
Because of the potency of RNAi in some organisms, it has been proposed that siRNAs are 
replicated by an RNA-dependent RNA polymerase (Hammond et al., 2001, Nature Rev Gen 
2: 110-119; Sharp, 2001, Genes Dev 15: 485-490). 

In the effector step, the siRNA duplexes bind to a nuclease complex to form what is 
known as the RNA-induced silencing complex, or RISC. The nuclease complex responsible 
for digestion of mRNA may be identical to the nuclease activity that processes input dsRNA 
to siRNAs, although its identity is currently unclear. In either case, the RISC targets the 
homologous transcript by base pairing interactions between one of the siRNA strands and the 
endogenous mRNA. It then cleaves the mRNA ~12 nucleotides from the 3' terminus of the 
siRNA (Hammond et al., 2001, Nature Rev Gen 2: 110-119; Sharp, 2001, Genes Dev 15: 
485-490). 

Methods and procedures for successful use of RNAi technology in post-transcriptional 
gene silencing in plant systems have been described by Waterhouse et al. 
(Waterhouse et al., 1998, Proc Natl Acad Sci USA, 95(23): 13959-64). Methods specific to 
construction of the RNAi constructs of the invention can be found in Examples 2 and 6 as 
well as Figures o and 10. While the invention encompasses use of any plant glycogenin-like 
gene of the invention in the RNAi constructs, in a preferred embodiment, the strands of RNA 
that form the dsRNA are complementary strands encoded by a coding region on the 3' end 
from nucleotide residues 1196-1662 of SEQ ID NO:2. 
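
The coordinates given for the preferred dsRNA region can be illustrated with a short sketch. It assumes that residues are numbered from 1 and that both endpoints are included, which the text does not state explicitly. The function names and the toy sequence are hypothetical; SEQ ID NO:2 itself is not reproduced here, and this is not the cloning procedure of the Examples.

```python
# Complement table for uppercase DNA bases (assumption: no ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def extract_region(seq, start, end):
    """Return residues start..end of seq, numbered from 1, both ends included."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dsrna_arms(cdna, start=1196, end=1662):
    """Return the sense arm and its complementary (antisense) arm for a region."""
    sense = extract_region(cdna, start, end)
    return sense, reverse_complement(sense)


# Toy stand-in for a cDNA; a real run would load the actual sequence instead.
toy_cdna = "ACGT" * 500
sense, antisense = dsrna_arms(toy_cdna)
print(len(sense))  # 1662 - 1196 + 1 = 467 residues
```

Under the inclusive, 1-based reading, the region spans 467 nucleotides; a 0-based or half-open convention would shift the arm by one residue, so the numbering convention should be checked against the sequence listing.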

For all of the aforementioned suppression or antisense constructs, it is preferred that 
such nucleic acid constructs express specifically in organs where starch synthesis occurs (i.e. 
tubers, seeds, stems, roots and leaves) and/or the plastids where starch synthesis occurs. 
Alternatively, it may be preferred to have the suppression or antisense constructs expressed 
constitutively. Thus, constitutive promoters, such as the nopaline synthase or CaMV 35S 
promoter, may also be used to express the suppression constructs. A most preferred promoter 
for these suppression or antisense constructs is a rice actin promoter. Alternatively, a 
co-suppression construct promoter can be one that expresses with the same tissue and 
developmental specificity as the plant glycogenin-like gene. 

In accordance with the present invention, desired plants with suppressed target gene 
expression may also be engineered by transforming a plant cell with a co-suppression 
construct. A co-suppression construct comprises a functional promoter operatively associated 
with a complete or partial plant glycogenin-like nucleic acid molecule. According to the 
present invention, it is preferred that the co-suppression construct encodes fully functional 
plant glycogenin-like gene mRNA or enzyme, although a construct encoding an incomplete 
plant glycogenin-like gene mRNA may also be useful in effecting co-suppression. 

In accordance with the present invention, desired plants with suppressed target gene 
expression may also be engineered by transforming a plant cell with a construct that can 
effect site-directed mutagenesis of the plant glycogenin-like gene. For discussions of nucleic 
acid constructs for effecting site-directed mutagenesis of target genes in plants see, e.g., 
Mengiste et al., 1999, Biol. Chem. 380:749-758; Offringa et al., 1990, EMBO J. 9:3077-84; 
and Kanevskii et al., 1990, Dokl. Akad. Nauk. SSSR 312:1505-7. It is preferred that such 
constructs effect suppression of plant glycogenin-like genes by replacing the endogenous 
plant glycogenin-like gene nucleic acid molecule through homologous recombination with 
either an inactive or deleted plant glycogenin-like protein coding nucleic acid molecule. 

In yet another embodiment, antisense technology can be used to inhibit plant 
glycogenin-like gene mRNA expression. Alternatively, the plant can be engineered, e.g., via 
targeted homologous recombination, to inactivate or "knock out" expression of the plant's 
endogenous plant glycogenin-like protein. The plant can be engineered to express an 
antagonist that hybridizes to one or more regulatory elements of the gene to interfere with 
control of the gene, such as binding of transcription factors, or disrupting protein-protein 
interaction. The plant can also be engineered to express a co-suppression construct. The 
suppression technology may also be useful in down-regulating the native plant 
glycogenin-like gene of a plant where a foreign plant glycogenin-like gene has been 
introduced. To be effective in altering the activity of a plant glycogenin-like protein in a 
plant, it is preferred that the nucleic acid molecules are at least 50, preferably at least 100 
and more preferably at least 150 nucleotides in length. In one aspect of the invention, the 
nucleic acid molecule expressed in the plant cell can comprise a nucleotide sequence of the 
invention which encodes a full length plant glycogenin-like protein and wherein the nucleic 
acid molecule has been transcribed only in the antisense direction. 

In a particular embodiment of the invention, a plant is engineered to express a dsRNA 
homologous to a portion of the coding region of an endogenous PGSIP or a plant 
glycogenin-like gene transcribed in the antisense direction in order to alter the starch content 
of the plant. In a preferred embodiment, the starch content is 40%, 30%, 20%, 10%, or 5% less 
than that of a non-engineered control plant(s). In another preferred embodiment, starch is 
absent from certain plant organs or tissues in comparison to a non-engineered control 
plant(s). In one embodiment starch content is decreased or absent in the leaves of plants 
engineered using the antisense technology described herein when compared to the starch 
content in a non-engineered control plant(s). In other embodiments the starch content of 
tubers or seeds is decreased or absent in plants engineered using the antisense technology 
described herein when compared to the starch content in a non-engineered control plant(s). 
Plant tissues in which starch content can be decreased using the methods of the invention 
include but are not limited to endosperm, leaf mesophyll, and root or stem cortex or pith. 

In another aspect of the invention, a plant cell is engineered to express a dsRNA 
homologous to a portion of the coding region of an endogenous PGSIP, or is engineered using 
the antisense technology described herein, and the plant cell and plants derived from such 
cells exhibit altered starch content. The altered starch content comprises a decrease in the 
ratio of amylose to amylopectin. In one embodiment of the invention, the ratio of amylose to 
amylopectin decreases by 10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered 
control plant(s). 

In a particular embodiment, a dsRNA homologous to a portion of the coding region of an 
endogenous PGSIP, or an antisense construct as described herein, is expressed in 
conjunction with a development-specific promoter directed towards later stages of 
development. In this particular embodiment, starch content in leaves of a plant can 
decrease, while starch content in other organs and tissues of the plant is altered in the 
same or different ways. 

In another particular embodiment, a dsRNA homologous to a portion of the coding region 
of an endogenous PGSIP, or an antisense construct as described herein, is expressed in 
cereal crops in conjunction with a development-specific promoter directed towards later 
stages of seed development. In this embodiment, the ratio of small starch granules to large 
starch granules increases. An increased ratio of small to large starch granules results in 
greater accessibility of starch granules, which has certain industrial and commercial 
advantages related to extraction and processing of starch. 

The progeny of the transgenic or genetically-engineered plants of the invention 
containing the nucleic acids of the invention are also encompassed by the invention. 

1.5 MODIFIED STARCH 

The invention encompasses methods of altering starch synthesis in a plant and the 
resulting modified starch produced. 

In the context of the present invention, "altering starch synthesis" means altering any 
aspect of starch production in the plant, from initiation by the starch primer to downstream 
aspects of starch production such as elongation, branching and storage, such that it differs 
from starch synthesis in the native plant. In the invention, this is achieved by altering the 
activity of the starch primer, which includes, but is not limited to, its function in initiating 
starch synthesis, its temporal and spatial distribution and specificity, and its interaction with 
downstream factors in the synthesis pathway. The effects of altering the activity of the starch 
primer may include, for example, increasing or decreasing the starch yield of the plant; 
increasing or decreasing the rate of starch production; altering temporal or spatial aspects of 
starch production in the plant; altering the initiation sites of starch synthesis; changing the 
optimum conditions for starch production; and altering the type of starch produced, for 
example in terms of the ratio of its different components. For example, the endosperm of 
mature wheat and barley grains contains two major classes of starch granules: large, early 
formed "A" granules and small, later formed "B" granules. Type A starch granules in wheat 
are about 20 µm in diameter and type B granules around 5 µm in diameter (Tester, 1997, in: 
Starch Structure and Functionality, Frazier et al., eds., Royal Society of Chemistry, 
Cambridge, UK). Rice starch granules are typically less than 5 µm in diameter, while potato 
starch granules can be greater than 80 µm in diameter. The quality of starch in wheat and 
barley is greatly influenced by the ratio of A-granules to B-granules. Altering the activity of 
the starch primer will influence the number of granule initiation sites, which will be an 
important factor in determining the number and size of formed starch granules. The degree to 
which the starch priming activity of the plant is affected will depend at least upon the nature 
of the nucleic acid molecule or antagonist introduced into the plant, and the amount present. 
By altering these variables, a person skilled in the art can regulate the degree to which starch 
synthesis is altered according to the desired end result. 

The methods of the invention (i.e. engineering a plant to express a construct 
comprising a plant glycogenin-like nucleic acid) can, in addition to altering the total quantity 
of starch, alter the fine structure of starch in several ways including but not limited to, 
altering the ratio of amylose to amylopectin, altering the length of amylose chains, altering 
the length of chains of amylopectin fractions of low molecular weight or high molecular 
weight fractions, or altering the ratio of low molecular weight or high molecular weight 
chains of amylopectin. The methods of the invention can also be utilized to alter the granule 
structure of starch, i.e. the ratio of large to small starch granules from a plant or a portion of a 
plant. The alteration in the structure of starch can in turn affect the functional characteristics 
of starch such as viscosity, elasticity, or rheological properties of the starch as measured 
using viscometric analysis. The modified starch can also be characterized by an alteration of 
more than one of the above-mentioned properties. 

In an embodiment the length of amylose chains in starch extracted from a plant 
engineered to express a construct comprising a plant glycogenin-like nucleic acid is decreased 
by at least 50, 100, 150, 200, 250, or 300 glucose units in length in comparison to amylose 
from non-modified starch from a plant of the same genetic background. In another 
embodiment, the length of amylose chains in starch is increased by at least 50, 100, 150, 200, 
250, or 300 glucose units in length in comparison to amylose from non-modified starch from 
a plant of the same genetic background. 

In an embodiment of the invention, the ratio of amylose to amylopectin decreases by 
10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered control plant(s). 

In a preferred embodiment, the ratio of low molecular weight chains to high 
molecular weight chains of amylopectin is altered by 10%, 20%, 30%, 40%, or 50% in 
comparison to a non-engineered control plant(s). 

In another preferred embodiment the average length of low molecular weight chains 
of amylopectin is altered by 5, 10, 15, 20, or 25 glucose units in length in comparison to a 
non-engineered control plant(s). In yet another preferred embodiment the average length of 
high molecular weight chains of amylopectin is altered by 10, 20, 30, 40, 50, 60, 70, or 80 
glucose units in length in comparison to a non-engineered control plant(s). 

According to one aspect of the invention, the ratio of small starch granules to large 
granules is altered by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more in 
comparison to a non-engineered control plant(s). 

WO 03/014365 PCT/GB02/03636 

In another aspect, the invention provides a complex comprising plant glycogenin-like 
proteins and plant polysaccharides. The inventors believe that members of the family of plant 
glycogenin-like proteins serve as primers for biosynthesis of a range of polysaccharides in 
plants, including but not limited to starch, hemicelluloses, and cellulose. The plant 
polysaccharides may be either homopolysaccharides comprising only a single type of 
monomeric unit or heteropolysaccharides comprising two or more different kinds of 
monomeric units. Accordingly, it is contemplated that plant glycogenin-like proteins form 
complexes with such polysaccharides and their subunits. Glycosylated plant glycogenin-like 
proteins are encompassed in the invention. In the broadest sense, the invention encompasses 
a complex comprising a plant glycogenin-like protein and a number of monomeric units, also 
referred to as subunits of the polysaccharides. Examples of monomeric units include but are 
not limited to glucose, xylose, mannose, galactose, ribose, and rhamnose, and may be a 
hexose or a pentose, wherein the number ranges from a single unit to thousands of monomeric 
units, and wherein the linkages between the subunits may vary, resulting in linear and/or 
branched structures. For example, starch and precursors of starch comprise glucose 
subunits joined by either alpha-1,4-glycosidic bonds or alpha-1,6-glycosidic linkages; 
cellulose and precursors of cellulose comprise glucose subunits joined by beta-1,4-glycosidic 
bonds. The number of monomeric units ranges from 1-3, 2-5, 4-10, 8-16, 15-30, 20-40, 
30-60, 50-100, 75-200, 100-500, or 300-800 monomeric units. Alternatively, the number of 
monomeric units ranges from 1000-5000, 5000-10,000, or 10,000-15,000 monomeric units. 
Preferably, the polysaccharide or its precursor is attached to a hydroxyl group of a tyrosine 
residue of the plant glycogenin-like protein. Without being bound by any theory or any 
mechanism, during biosynthesis, additional subunits, either singly or as oligosaccharides, are 
added to the complex such that the total number of subunits increases over a period of time. 

In one embodiment, the invention encompasses complexes comprising plant 
glycogenin-like protein and starch. In a specific embodiment, the complexes of plant 
glycogenin-like protein and starch are purified. The starch molecule or its precursor, 
including a single glucose subunit, can be attached to a hydroxyl group of a tyrosine residue 
of the plant glycogenin-like protein. In various embodiments, in a population of complexes, 
the starch molecules that are complexed with the plant glycogenin-like proteins have different 
chain lengths and branching structures, for example, 1-3, 2-5, 4-10, 8-16, 15-30, 20-40, 
30-60, 50-100, 75-200, 100-500, or 200-700 glucose subunits. The polysaccharide complexed 
with the plant glycogenin-like proteins may consist of 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 
60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, or 190 glucose subunits in length. 
In preferred embodiments of the invention, the polysaccharide is amylopectin, amylose, or a 
combination of both. 

The complexes of the invention can be used to identify sites of starch synthesis in 
stages of plant development. Briefly, the glycogenin-like protein can be labeled by means 
described herein and the complexes from tissues, cells, or organs can then be separated by 
size and compared among different stages of development. 

The embodiments described in each section above apply to the other aspects of the invention, 
mutatis mutandis. 

EXAMPLES 

EXAMPLE 1: Identification of Plant Glycogenin-like Gene Homologues in Arabidopsis 

Arabidopsis nucleic acid molecules showing similarities to yeast glycogenin genes 
were identified by sequence analysis. The sequence analysis programs used in the following 
examples are from the Wisconsin Package of computer programs (Devereux et al., Nucl. 
Acids Res. 12: 387 (1984); available from Genetics Computer Group, Madison, WI). ESTs 
and genes were identified using the program BLAST (Basic Local Alignment Search Tool; 
Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410, see also 
www.ncbi.nlm.nih.gov/BLAST/). 

The sequence comparison and identification program tblastx was used with the yeast 
glycogenin 1 (Glg1) gene (GenBank:U25546, Swiss_Prot (SP):P36143) to search against the 
Arabidopsis sequences collected in an in-house database comprising published plant 
sequences. A number of hits to this gene were obtained. One of the hits was identified as 
EMBL:AC004260 version GI:2957150, which was annotated as "Sequencing in progress." 
Therefore, the region showing homology to the yeast Glg1 gene was extracted and a protein 
sequence was predicted using GENSCAN (a protein prediction program, Burge, C. and 
Karlin, S. (1997), J. Mol. Biol., http://genes.mit.edu/GENSCANinfo.html). A blastp analysis 
using this protein showed strong homology to the glycogenin genes from C. elegans (8e-22), 
human (2e-19) and yeast (8e-06). A search in the database at NCBI at a later date showed that 
this gene is listed as T14N5.1 with the accession number EMBL:AC004260 
(SPTREMBL:O80649) and annotated as "Unknown protein". The protein sequence is set 
forth in SEQ ID NO: 6. 

The in-house database described above was also searched with the yeast Glg2 gene 
(GB:U25436, SP:P47011) and the sequence identified above (accession EMBL:AC004260) 
using the programs tblastn and tblastx. A number of further hits were identified. Out of the 
list of best hits, accession no. EMBL:AB026654, gene_id:MVE11.2 (SPTREMBL:Q9LSB1), 
showed strong homology to the glycogenin genes from C. elegans (1e-21), GYG2 human 
(3e-21) and yeast (5e-06). The genomic sequence representing this gene was extracted and is 
shown in SEQ ID NO: 1. Further analysis by the organelle prediction programs PREDOTAR 
and/or TargetP (Emanuelsson et al., J. Mol. Biol. 300: 1005-1016 (2000)) showed that the 
protein comprises a transit peptide as shown in Table 1 below. 

Table 1. TargetP v1.0 Prediction Results. 

Number of input sequences: 1 
Cleavage site predictions included. 
Using PLANT networks. 

Name            Length  cTP    mTP    SP     Other  Loc.  RC  TPlen 
AT3g18660 cDNA  659     0.792  0.181  0.004  0.172  C     2   65 
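
How the Loc. and RC columns of Table 1 relate to the four scores can be sketched as follows. This is an illustration only: the reliability-class thresholds (bands of 0.2 on the difference between the two highest scores) reflect our understanding of the TargetP documentation and are an assumption here, and the function name and dictionary keys are hypothetical.

```python
def targetp_call(scores):
    """Return (location, reliability class) from per-location scores.

    Assumed rule: the location is the highest-scoring category, and the
    reliability class (1 = strongest) depends on the gap between the two
    highest scores, in bands of 0.2.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_loc, best_score), (_, second_score) = ranked[0], ranked[1]
    diff = best_score - second_score
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return best_loc, rc


# Scores from Table 1: C = chloroplast transit peptide (cTP),
# M = mitochondrial targeting peptide (mTP), S = signal peptide (SP), _ = other.
table1_scores = {"C": 0.792, "M": 0.181, "S": 0.004, "_": 0.172}
print(targetp_call(table1_scores))  # ('C', 2)
```

With the Table 1 values the gap between cTP and mTP is 0.611, which falls in the second band and agrees with the reported Loc. of C and RC of 2.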



Performing blastp analysis using this protein against yeast sequences in an in-house 
database clearly showed sequence similarities to the yeast Glg1 and Glg2 gene, were and a 
CD-ROM containing the full genome sequence of Arabidopsis was made available. A search 
of the Arabidopsis genome sequencing project database published (Nature 408: 791, (2000)) 
showed that EMBL:AB026654 corresponded to the sequence having accession no. 
AT3g18660. However, AT3g18660 is reported to encode a protein of 575 amino acids 
whereas our analysis shows that this gene actually encodes a protein of 659 amino acids. A 
blastp analysis against the in-house database showed strong hits to five genes, 
EMBL:AC004260, AC000106, AC069144, AL035678 and AL035678 (corresponding to 
MIPS:at1g77130, at1g08990, at1g54940, at4g33330 and at4g33340). The sequences of these 
five genes are shown in SEQ ID NOs: 6, 8, 10, 12 and 14. The different accession numbers of 
these genes and their description in various databases are presented in Table 2. 



Table 2: Accession numbers of the genes in various databases 

MIPS       SPTREMBL  EMBL      GENE       Size 
AT3g18660  Q9LSB1    AB026654  MVE11.2    659 aa (a) 
at1g77130  O80649    AC004260  T14N5.1    1201 aa 
at1g08990  O04031    AC000106  F7G19.14   546 aa (b) 
at1g54940  Q9FZ37    AC069144  F14C21.47  557 aa 
at4g33330  Q9SZB0    AL035678  F17M5.90   333 aa 
at4g33340  Q9SZB1    AL035678  F17M5.100  277 aa 

Note: (a) = The AT3g18660 gene sequence in the MATDB (MIPS) database is reported to 
encode a 575 aa protein. The analysis performed by the inventors indicates that exon 2 
of the AT3g18660 gene is missing in the MATDB (MIPS) database sequence and 
present in sequences of the AT3g18660 gene found in other databases. 
(b) = The at1g08990 gene accession in the MATDB (MIPS) database is reported to 
encode a protein of 550 aa. The at1g08990 gene accession in other databases is 
546 aa in length. 

Table 3: Comparison of AT3g18660 with other glycogenin-like genes from Arabidopsis 

Comparison             % identity nucleotide  % identity protein 
AT3g18660 x at1g77130  68                     65 
AT3g18660 x at1g08990  61                     50 
AT3g18660 x at1g54940  61                     49 
AT3g18660 x at4g33330  60                     58 
AT3g18660 x at4g33340  60                     46 


Table 3 shows the percentage identity between AT3g18660 and other glycogenin-like 
genes from Arabidopsis using the programme BESTFIT of the GCG package. In each case, 
the full length nucleotide and peptide sequence was compared to that of the AT3g18660 gene. 
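
For readers unfamiliar with the measure, a percentage identity of this kind can be computed from an already aligned pair of sequences as sketched below. This is not BESTFIT: the alignment step is omitted, the convention of skipping gapped columns is an assumption (programs differ on how gaps are counted), and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides: 6 gap-free columns, 4 of them identical.
print(round(percent_identity("MKT-LLV", "MRTALLI"), 1))  # 66.7
```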

These levels of identity are consistent with the genes encoding proteins with the same 
function. For example, the two yeast glycogenin genes are about 50% identical to one 
another at the protein level and are both known to be involved in the same pathway; both are 
essential for the production of glycogen and one can complement for the function of the 
other. 

It is interesting that the carboxyl terminal region of the protein encoded by at1g77130 
shows homology to a starch synthase (dull1) from maize. In yeast, glycogenin and glycogen 
synthase physically interact. This finding may be the first indication that a similar scenario 
exists in plants. The at1g77130 gene appears to be a duplication of the AT3g18660 sequence, 
and the small region of homology with dull1 may indicate that during the course of evolution 
this gene has become physically close to dull1. Recently published work (Yanai et al., 
2001, Proc. Natl. Acad. Sci. USA 98(14): 7940-7945) suggests that a functional association 
between two genes can be derived from the existence of a fusion of the two as one continuous 
sequence in another genome. In yeast, it has been shown by experimentation that glycogenin 
and glycogen synthase physically interact and are associated together in an enzymatic 
complex to allow glycogen biosynthesis. The inventors believe that PGSIP interacts with 
soluble starch synthases at the start of the starch biosynthesis process. This could be the first 
step in the formation of a starch biosynthetic enzyme complex in which PGSIP acts as a 
template and starch synthases extend the chain, followed by branching by starch branching 
enzymes and other starch synthesis enzymes. It is likely that starch biosynthesis enzymes 
become associated with the very first complex formed in the process of the synthesis of a 
starch polymer. 

The sequences of the six genes listed in Table 2 were compared by BLAST against 
the Arabidopsis sequences in an in-house database and a further hit was obtained. The 
identified sequence corresponding to SPTREMBL: Q8W4AZ, EMBL: AY062695 encodes a 
protein of 618 amino acids that showed strong homology to the glycogenin genes (4e-26). 
Further analysis of the sequence indicated that the protein represents the C terminal domain 
of the At1g77130 gene (O80649, T14N5.1) and is also annotated as At1g77130, T14N5.1, 
which encodes an unknown protein. This sequence is set forth in SEQ ID NO: 23. 

EXAMPLE 2: Isolation of cDNA Encoding A. thaliana Glycogenin Homologue 

Primers were designed to clone a full length cDNA representing the accession number 
AB026654, gene_id:MVE11.2 (at3g18660 (MIPS)) from an Arabidopsis thaliana cDNA 
pool. Sequencing the full length clone indicated that the gene encodes a protein of 659 
amino acids and consists of five exons. The cDNA sequence is designated as SEQ ID NO: 2. 

Arabidopsis thaliana was grown in growth cabinets with a 16 hour light and 8 hour 
dark period at a temperature of 22°C during the day and 17°C during the night. A mixed 
cDNA sample was made with total RNA from 10 different tissues mixed together in equal 
amounts: root, dividing cell culture, young leaf, mature leaf, stem, seedling, seed, flower buds 
+ flowers, drought 6 days- and drought 10 days-subjected plants. 

The primer used to make the first strand cDNA using Superscript II was taken from the
original paper on PCR amplification by Frohman et al. ((1988) Proc. Natl. Acad. Sci. USA,
85:8998):

5' GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT 3'.

1 µl of this cDNA was used to amplify the cDNA clone representing the accession number
GTD:S:1870408 (gene_id:MVE11.2) utilizing the primer pairs Glgf1 and Glgint1, and ClaF and
Glgstop2.

Glgf1 primer: 5'-GACCATGGCAAACTCTCCCGC-3'
Glgint1 primer: 5'-GCAGCATACTTTTCCAATTAC-3'
ClaF primer: 5'-GCAAGTTCCGGCTATGGCAGC-3'
Glgstop2 primer: 5'-GCGTCACAAGTTATGGCCGGG-3'
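
As a quick check on the primer list above, the length and GC content of each oligonucleotide can be tabulated. The following is a minimal sketch and not part of the original protocol: the function names are illustrative assumptions, and the sequences are the printed primers with internal spaces removed.

```python
# Illustrative sketch (not from the original protocol): length and GC content
# of the four primers, using the printed sequences with spacing removed.
PRIMERS = {
    "Glgf1": "GACCATGGCAAACTCTCCCGC",
    "Glgint1": "GCAGCATACTTTTCCAATTAC",
    "ClaF": "GCAAGTTCCGGCTATGGCAGC",
    "Glgstop2": "GCGTCACAAGTTATGGCCGGG",
}


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarise(primers: dict) -> dict:
    """Map each primer name to (length in nt, GC content in percent)."""
    return {
        name: (len(seq), round(100 * gc_content(seq), 1))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc) in summarise(PRIMERS).items():
        print(f"{name}: {length} nt, {gc}% GC")
```
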
PCR conditions:

Five 50 µl reactions were set up as follows:

Composition                        PCR programme
Water                   35.5 µl    95°C  2 min (hot start)
10x buffer              5 µl       95°C  3 min
4mM dNTPs               2.5 µl     55°C  30 sec
Pfu Turbo polymerase    1 µl       72°C  2 min 30 sec
4mM primers             5 µl       72°C  10 min (extension)
cDNA                    1 µl
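
The composition above can be checked arithmetically and scaled to a master mix. The sketch below is only an illustration under stated assumptions: the per-reaction volumes are those listed in the table, while the idea of excluding the cDNA template from the shared mix (so that it is added per tube) is an assumption of this example and is not stated in the original.

```python
# Per-reaction volumes in microlitres, as listed in the composition table.
REACTION_UL = {
    "water": 35.5,
    "10x buffer": 5.0,
    "dNTPs": 2.5,
    "Pfu Turbo polymerase": 1.0,
    "primers": 5.0,
    "cDNA": 1.0,
}


def total_volume(mix: dict) -> float:
    """Sum of all component volumes for one reaction (µl)."""
    return sum(mix.values())


def master_mix(mix: dict, n_reactions: int, exclude=("cDNA",)) -> dict:
    """Scale shared components for n reactions.

    Components named in `exclude` are left out of the shared mix
    (assumption: the template is pipetted into each tube separately).
    """
    return {
        name: volume * n_reactions
        for name, volume in mix.items()
        if name not in exclude
    }


if __name__ == "__main__":
    print("per-reaction total (µl):", total_volume(REACTION_UL))
    for name, volume in master_mix(REACTION_UL, 5).items():
        print(f"{name}: {volume} µl")
```
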



Two products were obtained. These were cloned in pBluescript vector (SK-) 
(Stratagene) and a full length clone was obtained. The map of this plasmid is shown in 
Figure 1. 

EXAMPLE 3: Functional Analysis of The Arabidopsis cDNA 

Yeast contains two glycogenin genes, Glg1 (YKR058w) and Glg2 (YJL137c). Double
mutants in these genes do not make any glycogen (Cheng et al. (1995) Mol. and Cell
Biology 15(12):6632-6640). Mutant yeast strains from the EUROSCARF (European
Saccharomyces cerevisiae Archives For Functional Analysis) collection were obtained from
SRD GmbH, D61440, Germany, along with the wild type. Single mutants in the Glg1 and
Glg2 genes were obtained in addition to the double mutant. Additionally, a plasmid
containing the entire Glg2 ORF including the promoter was also obtained. This plasmid was
used as a positive control to establish a complementation assay. The strains are described
below:



Wild type

ORF        Accession no.    Strain         Genotype
           Y00000           BY4741         MATa; his3Δ1; leu2Δ0; met15Δ0; ura3Δ0


Single mutants:

ORF        Accession no.    Strain         Genotype
YKR058W    Y15129           glg1 mutant    BY4742; Mat alpha; his3Δ1; leu2Δ0; ura3Δ0; YKR058w::kanMX4
YJL137c    Y17003           glg2 mutant    BY4742; Mat a; his3Δ1; leu2Δ0; ura3Δ0; YJL137c::kanMX4



Double mutants:

Mutant Strains          Genotype
1. glg1/glg2 deleted    BY4742; Mat alpha; his3Δ1; leu2Δ0; ura3Δ0; YKR058w::kanMX4; YJL137c::kanMX4
2. glg1/glg2 deleted    BY4742; Mat a; his3Δ1; leu2Δ0; ura3Δ0; YKR058w::kanMX4; YJL137c::kanMX4



Plasmid

Plasmid name              Gene                   Marker
pYCG_YJL137c (pRS416)     Glg2 ORF + promoter    URA3



Glycogen defect assay 

First, it was established that the wild type and the double mutants were indeed
different. For this experiment, freshly grown wild type and double mutant cells were picked
from YPD plates and suspended in 100 µl of water in an eppendorf tube. To
this tube approximately 100 µl of glass beads (Sigma) and 10-20 µl of undiluted Lugol
solution (Sigma) were added. The cells were vortexed briefly, spun down for a few seconds and
assayed for colour development. The wild type cells stained brown whereas the double
mutants did not stain and appeared yellow.



Complementation assay 

Double mutants were transformed with the plasmid pRS416 and the transformants
were selected on CSM/Ura- plates (uracil drop out plates). As a negative control, double
mutants were transformed without the plasmid. Many colonies were obtained on the positive
plate but no colonies were obtained from the negative control, indicating that the
transformation had worked. The transformed double mutants were grown overnight in
CSM/Ura- liquid media along with wild type and single mutants. Next day the OD600 was
checked to ensure equal amounts of cells in each of the tubes. Approximately equal amounts
of cells were taken in an eppendorf tube, and to this an equal amount of glass beads was added
followed by 10-20 µl of undiluted Lugol solution (Sigma). The cells were vortexed briefly,
centrifuged for a few seconds and assayed for colour development. Complementation was
observed in the double mutants, as they appeared blue, similar to the single glg1 and glg2
mutants.

Optimisation of the assay to distinguish wildtype and mutant strains

A small amount of the wildtype (WT) and glycogenin double mutant (Mut) yeast
strains was picked from a well-grown plate, resuspended in 1 ml of water, and vortexed
briefly. The cells were diluted further in 1 ml of water and 50 µl of the diluted cells were
plated on YPD plates. The plates were incubated at 30°C for two days and were then
exposed to iodine vapour by inverting them on top of a 500 ml glass beaker
containing iodine chips (Sigma) placed on a low heater in a fume cupboard for 2-3
minutes. Afterwards the plates were left open in the fume cupboard for 1 minute and
the colour development was monitored. The WT cells stained brown and the double mutants
(Mut) stained pale yellow.

Cloning PGSIP cDNA into the pYES2 vector for complementation studies

Two constructs were made for the experiment: one contained the full length PGSIP
cDNA including the transit peptide (TP), and in the other the transit peptide was
removed (no transit peptide: NTP). Both were cloned into the pYes2 vector (Invitrogen).
Primers were designed to amplify the full length PGSIP cDNA with the transit peptide
(primers TPF and TPR) and without the transit peptide (primers NTPF and NTPR) so that
these could be cloned into the pYes2 vector. A BamHI restriction enzyme site was
incorporated into the forward primers (TPF and NTPF) and a XhoI restriction enzyme site
was incorporated into the reverse primers (TPR and NTPR). The NTP forward primer
(NTPF) was designed so that it annealed at nucleotide position 190 of the
full length PGSIP sequence, and an ATG initiation codon was inserted after the BamHI site to
ensure that translation into protein could occur. This resulted in a cDNA sequence lacking the
first 63 amino acids of the PGSIP sequence, which represent the transit peptide as
predicted by the TargetP program (Emanuelsson et al., J. Mol. Biol. 300:1005-1016 (2000)).
The primer sequences were as follows:

TPF  5'-GGATCCGACCATGGCAAACTCTCCCGC-3'
TPR  5'-CTCGAGGCGTCACAAGTTATGGCCGGG-3'
NTPF 5'-GGATCCATGTGTTGTTGTTTCACCAAG-3'
NTPR 5'-CTCGAGGCGTCACAAGTTATGGCCGGG-3'
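
The primer design described above can be checked with a few lines of code. This is a hedged sketch rather than the inventors' procedure: it assumes that nucleotide numbering starts at the first base of the coding sequence, and the helper names `has_site` and `residues_skipped` are invented for this illustration. The recognition sequences GGATCC (BamHI) and CTCGAG (XhoI) are the standard ones.

```python
# Standard recognition sequences for the two restriction enzymes.
BAMHI = "GGATCC"
XHOI = "CTCGAG"

# Primer sequences as printed (spacing removed).
TPF = "GGATCCGACCATGGCAAACTCTCCCGC"
TPR = "CTCGAGGCGTCACAAGTTATGGCCGGG"
NTPF = "GGATCCATGTGTTGTTGTTTCACCAAG"
NTPR = "CTCGAGGCGTCACAAGTTATGGCCGGG"


def has_site(primer: str, site: str) -> bool:
    """True if the primer carries the restriction site at its 5' end."""
    return primer.startswith(site)


def residues_skipped(anneal_nt: int) -> int:
    """Complete codons upstream of a 1-based nucleotide position.

    Assumption: position 1 is the first base of the coding sequence.
    """
    return (anneal_nt - 1) // 3


if __name__ == "__main__":
    print("forward primers carry BamHI:", has_site(TPF, BAMHI), has_site(NTPF, BAMHI))
    print("reverse primers carry XhoI:", has_site(TPR, XHOI), has_site(NTPR, XHOI))
    print("codon after BamHI in NTPF:", NTPF[len(BAMHI):len(BAMHI) + 3])
    print("residues removed when annealing at nt 190:", residues_skipped(190))
```

Under that numbering assumption, annealing at position 190 leaves out 189 nucleotides, which is consistent with the 63 amino acids stated in the text.
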

A 50 µl PCR reaction was set up with Pfu polymerase (Stratagene) as follows: a
cocktail solution was made with 35.5 µl water, 5 µl 10X PCR buffer, 2.5 µl solution (20mM
MgCl2 and 4mM dNTPs), 1 µl Pfu polymerase, 5 µl 4mM primers (TP/NTP), and 1 µl cDNA
(1/100 dilution). The PCR thermocycler program consisted of 95°C for 3 min (hot start), followed by
30 cycles of 95°C for 30 sec, 50°C for 30 sec, and 72°C for 3 min. The final step in the
program held the temperature at 24°C.
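
For planning purposes, the programmed hold time of the cycling protocol above can be estimated. The sketch below is illustrative only: the function name is an assumption, and ramping between temperatures and the final 24°C hold are deliberately ignored, so the real run time will be longer.

```python
def program_minutes(hot_start_s: int, cycles: int, step_seconds: tuple) -> float:
    """Total programmed hold time in minutes (ramp times not included)."""
    return (hot_start_s + cycles * sum(step_seconds)) / 60


if __name__ == "__main__":
    # 95°C 3 min hot start, then 30 cycles of 30 s / 30 s / 3 min.
    print(program_minutes(180, 30, (30, 30, 180)), "min")
```
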

The amplified fragment was run out on an agarose gel, cut out and purified using the
'Geneclean kit' (Bio101) according to the manufacturer's instructions. The purified cDNA
fragments were ligated into pBluescript vector (Stratagene) cut with EcoRV restriction
enzyme. Positive clones were identified and sequenced. Clones with the correct
sequences were then cut with the restriction enzymes BamHI and XhoI and ligated into pYes2
vector cut with the same enzymes. Positive clones were identified and
named pTPYes (Figure 2) and pNTPYes (Figure 3). In these plasmids, the cDNA
was under the control of the yeast Gal1 promoter, which is both glucose repressible and
galactose inducible.

Complementation analysis with the Arabidopsis glycogenin gene 

Yeast strains were transformed with the above plasmids following the method of
Finley and Brent, 1995, (http://cmmg.biosci.wayne.edu/finlabA^THprotocols.htm and links
therein) in combination with the Clontech yeast transformation kit. From a freshly grown
plate, a 5 ml culture of each yeast strain (WT and Mut) was inoculated in YPD medium (Clontech)
and grown overnight with shaking at 30°C. Next day, 3 ml of freshly grown cells were inoculated into
150 ml YPD medium (OD600=0.2) and grown with shaking at 30°C for 3-4 hours (OD600=0.7).
100 ml of cells were then transferred to two 50 ml orange cap tubes and centrifuged at room
temperature at 2000 rpm for 3 minutes. The supernatant was discarded completely. The cells
were washed by resuspending them in 2.5 ml of sterile water followed by centrifugation as
before. The supernatant was discarded and the cells were resuspended by adding 625 µl of
Lithium Acetate (LiAc)/TE (10mM Tris-HCl pH 7.5, 1mM EDTA, 100mM LiAc; made
from a filter-sterile stock of 1M LiAc, pH 7.5) to each tube. The cells were centrifuged as
before and the supernatant was discarded. The cells in each tube were resuspended in 250 µl of
LiAc/TE and then pooled into a single eppendorf tube, giving 500 µl of competent yeast cells. In an
eppendorf tube the following was prepared: 6 µl Herring Testis DNA (Clontech, 10mg/ml,
boiled earlier for 10 minutes and quenched on ice), 8 µl DNA [pYes2 empty plasmid, pTPYes
or pNTPYes DNA (~2 µg)] and 6 µl of water, making a total volume of 20 µl. In another tube,
100 µl of competent yeast cells were combined with the 20 µl mixture made above, plus
11 µl DMSO and 600 µl of 40% PEG 4000 in LiAc/TE (made from stocks of 1M LiAc pH
7.5, filter sterile 50% PEG 4000 in water, 1M Tris-HCl pH 7.5 and 0.5M EDTA).
The tubes were inverted gently three to four times and incubated at 30°C for 30 minutes. The
tubes were inverted again gently and heat shocked at 42°C for 20 minutes, after which
50-100 µl was directly plated on CSM/Ura-/glucose plates. The plates were incubated for two
to three days at 30°C. Additionally, as a negative control, WT and Mut yeast strains were
transformed with the empty pYes2 plasmid. As a positive control the Mut strains were
transformed with the yeast GLG2 gene (plasmid pRS416) purchased from EUROSCARF.
The transformed cells were selected on CSM/Ura- glucose drop out plates. After two days the
cells were picked individually into patches and streaked onto glucose and galactose plates. In
the end, we had the following plates (Table 4).



Table 4

Name                                    Glucose    Galactose
1. WT:pYes2 control                                Yes
2. Mut:pYes2 control                    Yes        Yes
3. WT:NTP                               Yes        Yes
4. Mut:NTP                              Yes        Yes
5. WT:TP                                Yes        Yes
6. Mut:TP                               Yes        Yes
7. Mut yeast GLG2 gene +ve control      Yes        Yes



Yeast strains used for the complementation experiment (Table 5)

Table 5

Name
1. WT:pYes2 control
2. Mut:pYes2 control
3. Mut:TP
4. Mut:NTP
5. Mut:yeast GLG2



The plates listed in Table 4 and Table 5 were grown for two days at 30°C as described
above. The cells were diluted and plated onto both CSM/Ura- glucose and
CSM/Ura- galactose plates. After two days of growth at 30°C the cells were exposed to iodine
vapour as described above and photographs were taken. The photographs
confirmed that the assay worked, as the Mut strains containing the yeast GLG2 gene (no. 7
in Table 4) stained brown on both the glucose and galactose plates. The WT strain (no. 1
in Table 4) stained brown whereas the Mut strains containing
the empty plasmid (no. 2 in Table 4) stained yellow. The cells containing the NTP plasmid
(no. 4 in Table 4) stained yellow on glucose plates but brown on galactose plates, although the brown
colour was not as intense as that observed in Mut strains containing the yeast GLG2 gene, indicating
that the complementation is partial. These data indicate that the PGSIP cDNA is a functional
orthologue of the yeast glycogenin gene and plays a role in starch biosynthesis in
plants, particularly in Arabidopsis. The cells containing the TP plasmid (no. 6 in
Table 4) stained yellow on both glucose and galactose plates, indicating that complementation was
not achieved with this plasmid. In general, validating the function of plant genes by yeast
complementation has been reported (Alderson et al., Proc. Natl. Acad. Sci. USA, 88:8602-8605
(1991); Vogel et al., Plant J, 13(5):673-683, 1998; Blazquez et al., Plant J, 13(5):685-689,
1998).
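
The interpretation of the staining results above can be summarised as a small decision rule. The following sketch is purely illustrative: the ordinal intensity scale, the label "weak brown" for the less intense NTP staining, and the function names are assumptions introduced here and are not part of the original assay. Only strains whose per-plate results are stated in the text are encoded.

```python
# Assumed ordinal scale for iodine stain intensity (not from the original).
SCORE = {"yellow": 0, "weak brown": 1, "brown": 2}

# (glucose plate, galactose plate) stains as described in the text.
OBSERVED = {
    "Mut + yeast GLG2": ("brown", "brown"),
    "Mut + NTP": ("yellow", "weak brown"),
    "Mut + TP": ("yellow", "yellow"),
}


def classify(galactose: str) -> str:
    """Classify complementation from the stain on the galactose plate."""
    level = SCORE[galactose]
    if level == 0:
        return "no complementation"
    if level == 1:
        return "partial complementation"
    return "full complementation"


def galactose_dependent(glucose: str, galactose: str) -> bool:
    """True when staining appears only under inducing (galactose) conditions."""
    return SCORE[glucose] == 0 and SCORE[galactose] > 0


if __name__ == "__main__":
    for strain, (glc, gal) in OBSERVED.items():
        print(strain, "->", classify(gal),
              "| galactose-dependent:", galactose_dependent(glc, gal))
```

The galactose-dependence check reflects the Gal1 promoter described earlier, which is glucose repressible and galactose inducible, whereas the GLG2 control plasmid carries its own promoter.
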

EXAMPLE 4: cDNA Isolation from Maize Endosperm 
Maize EST identification 

ESTs encoding the corn glycogenin gene were identified using the program BLAST
(Basic Local Alignment Search Tool; Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410;
see also www.ncbi.nlm.nih.gov/BLAST/). A database search using the Arabidopsis genes
At3g18660 and At1g77130 against the maize database at NCBI identified accession nos. GB:
BF729544 and GB: BG837930, which showed significant similarity to the Arabidopsis
glycogenin genes. The sequences of the two ESTs are shown in SEQ ID NO: 4 and SEQ ID
NO: 5 respectively. A blastx analysis of the two ESTs against the SPTREMBL database showed
that EST BF729544 gave its first hit to the At3g18660 gene whereas EST BG837930
gave its first hit to the At1g77130 gene. Protein alignments of these ESTs indicated that both
ESTs were partial and that they showed 85-86% identity to the above two Arabidopsis genes.
Moreover, for EST BF729544 the identity was confined to the central portion of the
At3g18660 protein, starting at amino-acid position 245 and ending at position 427, whereas
for EST BG837930 the identity started at amino-acid position 391 and extended to
position 632. A bestfit analysis between the nucleotide sequences of the two ESTs and the
At3g18660 gene showed that the two ESTs have 68-69% identity to that gene. A bestfit analysis
between the two EST DNA sequences showed that there was a high degree of homology
between the two ESTs. From the above analysis, it appears that EST BF729544 is the
homolog of the Arabidopsis At3g18660 gene, whereas EST BG837930 is a homolog of the
Arabidopsis At1g77130 gene.
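
Percent identity figures such as those quoted above are derived from pairwise alignments. The sketch below shows one common way of computing the value from two already aligned sequences; it is an assumption of this example that gapped columns are excluded from the denominator, and programs such as BLAST or bestfit may use a different convention. The toy sequences in the usage lines are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two aligned sequences.

    Both inputs must have the same length; '-' marks a gap. Columns with a
    gap in either sequence are skipped (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the ESTs in the text.
    print(percent_identity("ACGTAC-T", "ACGAACGT"))
```
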

A database search using the Arabidopsis genes At3g18660 and At1g77130 against
the in-house maize database identified four additional sequences which showed significant
similarity to the Arabidopsis glycogenin genes. The four nucleotide sequences, called Maize
SEQ 1, Maize SEQ 2, Maize SEQ 3 and Maize SEQ 4, are shown in SEQ ID NOs: 27, 29, 31
and 33, and the deduced amino acid sequences for these nucleotide sequences are shown in
SEQ ID NOs: 28, 30, 32 and 34.

Culture conditions

Maize was grown in the greenhouse with a 16 hour daylight and 8 hour night period 
with a temperature of 24°C during the day and 18°C during the night. Seeds were harvested 
at different stages between 3 and 35 days after pollination (DAP). Young and medium 
leaves were also harvested. 

Establishment of copy number and identification of glycogenin homolog in maize, wheat and 
Arabidopsis 

Genomic DNA was isolated from Arabidopsis, wheat and maize leaves according to
the method of Davies et al. ((1994) Methods in Molecular Biology vol. 28: Protocols for
nucleic acid analysis by non-radioactive probes, Isaac P.G. (ed.) pp 9-15, Humana Press,
Totowa, NJ, USA). DNA was digested with the restriction enzymes EcoRI, XhoI and EcoRV, and
the digested DNA was run overnight at 20 V in 1% agarose gels. The DNA was then
transferred to a nylon membrane by vacuum blotting. Two identical Southern blots were
prepared and each one was probed first at high stringency and later at low stringency.
One blot was probed with a digoxigenin labelled At3g18660 cDNA probe
encoding the N-terminus of the gene (a l.Skb NcoI-AvaI fragment) and the second was probed
with an At3g18660 cDNA probe (PGSIP) encoding the C-terminus of the gene (a 700bp Cla K
fragment), Figure 5C. Hybridisation was done at 65°C and the blots were washed first for 2
x 5 minutes with 2 x SSC, 0.1 x SDS and later with 0.1 x SSC and 0.1 x SDS at 65°C (high
stringency washes). Strong single bands of the expected sizes (5.9kb in the XhoI cut DNA,
4.6kb in the EcoRI cut DNA and 5.1kb in the EcoRV cut DNA) were observed only in the
lanes containing Arabidopsis DNA. No band was observed in the lanes containing maize and
wheat DNA, as shown in Fig. 4B. Later the blots were stripped, re-probed at
55°C and washed at 60°C for 2 x 15 minutes with 2 x SSC, 0.5 x SDS (low stringency
washes). Three bands were observed in the lane containing XhoI digested Arabidopsis DNA, and
two to three bands were observed in the lanes containing maize and wheat DNA, as shown in
Fig. 5A and 5B. From the genomic sequence of the At3g18660 gene it was known that it
spanned two XhoI, EcoRI and EcoRV sites. This demonstrated that PGSIP exists as a gene
family comprising about 2-3 genes in Arabidopsis, maize and wheat.

RNA extraction and first strand cDNA synthesis 

Total RNA was extracted from the tissues described above using the method of
Napoli et al. (1990), Plant Cell, 2, 279-289 and in some cases using a Qiagen RNA extraction
kit following the manufacturer's protocol. First strand cDNA was made using Superscript II
reverse transcriptase (GIBCO-BRL) and the oligo dT primer described in Frohman et al.
(1988), Proc. Natl. Acad. Sci. USA, 85:8998:

5' GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT 3'.

This cDNA pool was used to amplify a maize cDNA homolog of the Arabidopsis
glycogenin genes (At3g18660 and At1g77130) utilising the sequence information from the
ESTs GB:BF729544 and GB:BG837930 described above.

EST BF729544 and BG837930 overlapped and these were combined to deduce a 
single maize PGSIP sequence. Primers were designed to amplify a maize cDNA clone 
corresponding to this sequence. Primer sequences were as follows. 

[GlgmaF] 5'-GGCAATAGAGGAATTCATGTGC-3'
[GlgmaR] 5'-CGTGCAGAACTCGGACCACAG-3'

Construction of a Maize cDNA library

Total RNA was extracted from the various tissues described above (leaves and seeds
ranging from 3-35 DAP). The RNA obtained was mixed in equal amounts. This RNA
mixture was then used to make a maize cDNA library using the SMART cDNA library
construction kit (Clontech) following the manufacturer's instructions.

Cloning of Maize cDNA

1 µl of the first strand cDNA obtained above was used to amplify the cDNA clone
represented by the ESTs by PCR using the primers GlgmaF and GlgmaR. The PCR product
obtained was cloned into EcoRV cut pBlueScript (SK-) and positive clones were identified.
These positive clones were sequenced to confirm that the product obtained indeed represented
the sequence in the EST accession number BF729544. This product was then used to screen
the cDNA library and a full length clone was obtained. Similarly, a cDNA clone represented
by the EST accession no. BG837930 was also cloned.

The PCR conditions were the same as described before for cloning the Arabidopsis
gene (At3g18660) of SEQ ID NO: 2.

EXAMPLE 5: cDNA Isolation From Wheat Endosperm 

A database search using the Arabidopsis genes At3g18660 and At1g77130 against
the in-house wheat database identified one sequence which showed significant similarity to
the Arabidopsis PGSIP genes (e-137). The sequence, called Wheat SEQ1, is shown in SEQ ID
NO: 20.

Culture conditions 

Wheat variety NB1 (described in patent WO 00/63398) was grown in the glass house 
with a 16 hour daylight and 8 hour night period with 22°C during the day and 15°C during 
the night. Seeds were harvested at different stages between 5 and 20 days after pollination 
(DAP). Young and medium leaves were also harvested. 

RNA extraction and first strand cDNA synthesis 

Total RNA was extracted from the above tissues using the method of Napoli et al.
(1990) and in some cases using a Qiagen RNA extraction kit following the manufacturer's
protocol. First strand cDNA was made using Superscript II reverse transcriptase
(GIBCO-BRL) and the oligo dT primer described in Frohman et al. (1988), Proc. Natl. Acad.
Sci. USA, 85:8998. This cDNA pool was used to amplify a wheat cDNA homolog of the
Arabidopsis glycogenin genes (At3g18660 and At1g77130) utilising the sequence information
from the maize ESTs, NCBI accession nos. BF729544 and BG837930, described above.

Wheat cDNA library making

Total RNA was extracted from the various tissues described above (leaves and seeds
ranging from 7-30 days post anthesis (DPA)). The RNA obtained was mixed in equal
amounts. This RNA mixture was then used to make a wheat cDNA library using the SMART
cDNA library construction kit (Clontech). Additionally, a genomic library from Triticum
tauschii var. strangulata, accession number CPI 110799, described in Rahman et al. (1997,
Genome, 40:465-474), was also used in this study. The cDNA library from wheat cv. Wyuna
described in Li et al. (1999, Theor. Appl. Gen. 98:226-233) was also used in this study.

Cloning of wheat cDNA 

Because a strong band was observed on Southern blots probed with the Arabidopsis
gene (At3g18660), it was assumed that there is a significant degree of homology between the
Arabidopsis, maize and wheat DNA sequences. A comparison of the Arabidopsis and the
maize EST sequences also suggested that this was the case. A wheat cDNA library was
screened with probes made from the maize and the Arabidopsis glycogenin genes. A full
length clone was obtained by restriction mapping and analysing the sequence of a number of
positive clones.

PCR conditions 

The PCR conditions were the same as described before for cloning the Arabidopsis
gene (At3g18660).

EXAMPLE 6: Agrobacterium Constructs
Construct making

The pSBl 1 1 Sulugi described in patent publication WO 00/63398 was used. Six
different constructs were made: one each for maize, wheat and Arabidopsis in sense
orientation and one each for maize, wheat and Arabidopsis in antisense orientation, for
constitutive expression. Another set of six constructs was also made using seed specific
promoters.

Two constructs were made, one for overexpression and another for downregulation of
the Atglycogenin gene. For overexpression, the Atglycogenin gene was excised from the
plasmid (At3g18660 (PGSIP), Figure 1) with a SalI-EcoRI digest and ligated into SalI-EcoRI cut
pJIT65, resulting in plasmid pCL68. This plasmid was then digested with EcoRI-XhoI and the
fragment was ligated into SalI-SmaI cut Nos-NptII SCV, resulting in plasmid pCL68 SCV. In
this plasmid the Atglycogenin gene is under the 2x 35S promoter for constitutive expression.

For the RNAi construct, a fragment representing the 3' end of the Atglycogenin gene
was first amplified by PCR using the ClaF and Glgstop2 primers (see Example 2) and cloned into
pBluescript. The resulting construct was designated pMC167. Clones in both orientations
were obtained and the clone with the fragment in reverse orientation was called pMC167inv.
pMC167inv was cut with EcoRV-SmaI and religated, resulting in plasmid pMC167del.
pMC167del was cut with HindIII-BamHI and the fragment ligated into HindIII-BamHI cut pT7blue2,
resulting in plasmid "GlycoinpT7Blue2" (pCL66). Another plasmid (called
"GlycogeninIRstep1", pCL67) was created by cutting pMC167inv with XhoI-EcoRV and
ligating this fragment into XhoI-EcoRV cut pWP446A containing the AtSac25 intron 1.
Finally, plasmid "GlycoinpT7Blue2" (pCL66) was cut with BamHI-SstI and the fragment
ligated into BamHI-SstI cut "GlycogeninIRstep1" (pCL67), resulting in plasmid pCL69.
pCL69 was cut with EcoRI-XhoI and the fragment was ligated into SCV Nos-NptII at the
SmaI-SalI site, resulting in plasmid pCL76 SCV. In this plasmid the Atglycogenin (PGSIP)
RNAi is under the 2x 35S promoter for constitutive expression.
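
The multi-step RNAi cloning route described above is easier to follow as a derivation graph. The sketch below is a hedged illustration: the dictionary simply records which plasmids or fragments each construct is described as being built from, and the names `DERIVED_FROM` and `ancestors` are invented for this example.

```python
# Each construct maps to the materials it is described as being built from.
DERIVED_FROM = {
    "pMC167": ["pBluescript", "ClaF/Glgstop2 PCR fragment"],
    "pMC167inv": ["pBluescript", "ClaF/Glgstop2 PCR fragment"],  # reverse orientation clone
    "pMC167del": ["pMC167inv"],             # EcoRV-SmaI cut, religated
    "pCL66": ["pMC167del", "pT7blue2"],     # HindIII-BamHI fragment
    "pCL67": ["pMC167inv", "pWP446A"],      # XhoI-EcoRV fragment
    "pCL69": ["pCL66", "pCL67"],            # BamHI-SstI fragment
    "pCL76 SCV": ["pCL69", "SCV Nos-NptII"],  # EcoRI-XhoI fragment
}


def ancestors(construct: str, graph: dict = DERIVED_FROM) -> set:
    """Return every plasmid or fragment that contributes to a construct."""
    seen = set()
    stack = [construct]
    while stack:
        for parent in graph.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    for name in sorted(ancestors("pCL76 SCV")):
        print(name)
```

Tracing `pCL76 SCV` in this way makes it visible that the inverted clone pMC167inv enters the final construct by two routes, through pCL66 and through pCL67.
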

Figure 6 summarises the whole process and the maps of these plasmids are shown in 
Figures 9 and 10. The plasmids were transformed into the GV3101 Agrobacterium strain and 
the Arabidopsis plants were transformed. 

EXAMPLE 7: Transformation of Wheat 

Wheat plants transformed with the constructs of Example 6 were produced by the seed
inoculation method described in patent publication WO 00/63398. Solanum tuberosum c.v.
Prairie was transformed with pCL68 SCV and pCL76 SCV using the method of leaf disk
cocultivation essentially as described by Horsch et al. (Science 227: 1229-1231, 1985). The
youngest two fully-expanded leaves from a 5-6 week old soil grown potato plant were excised
and surface sterilised by immersing the leaves in 8% 'Domestos' for 10 minutes. The leaves were
then rinsed four times in sterile distilled water. Discs were cut from along the lateral vein of the
leaves using a No. 6 cork borer. The discs were placed in a suspension of Agrobacterium
tumefaciens strain LBA4404 containing one of the two plasmids listed above for approximately
2 minutes. The leaf discs were removed from the suspension, blotted dry and placed on petri
dishes (10 leaf discs/plate) containing callusing medium (Murashige and Skoog agar containing
2.5 µg/ml BAP, 1 µg/ml dimethylaminopurine, 3% (w/v) glucose). After 2 days the discs were
transferred onto callusing medium containing 500 µg/ml Claforan and 50 µg/ml Kanamycin. After
a further 7 days the discs were transferred (5 leaf discs/plate) to shoot regeneration medium
consisting of Murashige and Skoog agar containing 2.5 µg/ml BAP, 10 µg/ml GA3, 500 µg/ml
Claforan, 50 µg/ml Kanamycin and 3% (w/v) glucose. The discs were transferred to fresh shoot
regeneration media every 14 days until shoots appeared. The callus and shoots were excised and
placed in liquid Murashige and Skoog medium containing 500 µg/ml Claforan and 3% (w/v)
glucose. Rooted plants were weaned into soil and grown up under greenhouse conditions to
provide tuber material for analysis.

Alternatively, microtubers were produced by placing nodal pieces of tissue culture grown
plants onto Murashige and Skoog agar containing 2.5 µg/ml Kanamycin and 6% (w/v) sucrose.
These were kept in the dark at 19°C for 4-6 weeks, by which time microtubers were produced in the leaf
axils.

EXAMPLE 8: Transformation of Maize 

Maize plants transformed with the constructs of Example 6 were produced by the seed 
inoculation method described in patent publication WO 00/63398. 

EXAMPLE 9: Transformation of Potato 

Transgenic potato plants expressing the Arabidopsis plant glycogenin-like gene in sense
and antisense orientation were produced.



WO 03/014365 



PCT/GB02/03636 




EXAMPLE 10: Characterisation of the Transgenic Lines

Transgenic plants were analysed by the following methods.

For sense constructs, 20 T1 lines were analysed; for antisense constructs, 50 T1 lines
were analysed. Plants transformed with sense and antisense sequences of the invention were
observed to have altered starch synthesizing ability which was linked to the expression of the
transgene.

For the maize, wheat, and potato lines examined, several techniques of analysis were
employed. PCR (positive line identification), northern (RNA expression), Southern (copy number
detection), western (protein expression), amylogenin activity, starch structure and quality, and
phenotype all confirmed the successful transformation of the maize, wheat, and potato.

EXAMPLE 11: cDNA Isolation from Rice 

The six genes listed in Table 2 were blasted against the rice sequences collected in an
in-house database and one new hit was obtained. The accession corresponded to
SPTREMBL:Q94HG3, EMBL:AC079633 (SEQ ID NO: 25), which encodes a protein of 614
amino acids and shows strong homology to the PGSIP gene (e-129).

EXAMPLE 12: Arabidopsis Transformation. 

Arabidopsis thaliana c.v. Columbia plants were transformed according to the method of
Clough and Bent, Plant J. 16(6):735-743 (1998), with slight modification. Plants were
grown to a stage at which bolts were just emerging. Phytagar 0.1% was added to the seeds and
these were vernalized overnight at 4°C. We used 10-15 seeds per 3x5 inch pot. Seed was added
onto the soil with a pipette; about 4-5 seeds per ml were dispersed. Seeds were germinated as
usual (i.e. under humidity: pots were covered until the first leaves appeared and then, over a two day
period, the lid was cracked and then removed). Plants were grown for about 4 weeks in the
greenhouse (long day conditions) until bolts emerged. The first bolts were cut to encourage
growth of multiple secondary bolts. Bolts containing many unopened flower buds were chosen
for dipping.

Growing the Agrobacterium culture

Aliquots of the Agrobacterium strain GV3101 carrying the constructs pCL68 SCV and
pCL76 SCV were grown first as a 5 ml culture in YEP containing Gentamycin (15 µg/ml) and
Kanamycin (20 µg/ml). Next day, 2 ml of freshly grown culture was added to 400 ml YEP media (10 g
Yeast Extract, 10 g peptone, 5 g NaCl, pH 7.0) in a 2 litre flask, and the flask was incubated in a
28°C incubator with shaking overnight. Next day the OD600 of the cells was measured and found
to be 1.8. Cells were divided into 2X Oakridge bottles and harvested by centrifugation at
5000 rpm for 10 min in a GSA rotor at room temperature. The pellet was resuspended in 3
volumes of infiltration media so that the final OD600 of the culture was 0.6. Infiltration
media was prepared as follows: ½ Murashige and Skoog Salts, 1x Gamborg's
Vitamins and 0.044 µM Benzylamino Purine (10 µl per L of a 1 mg/ml stock); the pH was adjusted to
5.7 with NaOH. Then 0.02% Silwet (200 µl per 1 L) was added and mixed into the solution.
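
The dilution figures in this paragraph can be verified with simple arithmetic. The sketch below is illustrative and rests on two assumptions that are not stated in the original: that optical density scales linearly with dilution, and that "3 volumes" means three times the original culture volume. The function names are invented for this example.

```python
def resuspended_od(start_od: float, volumes: float) -> float:
    """OD after resuspending a pellet in `volumes` times the culture volume.

    Assumes OD is proportional to cell concentration over this range.
    """
    return start_od / volumes


def percent_v_v(additive_ul: float, total_ml: float) -> float:
    """Volume/volume percentage of an additive in a solution."""
    return 100 * additive_ul / (total_ml * 1000)


if __name__ == "__main__":
    print("OD after resuspension:", round(resuspended_od(1.8, 3), 2))
    print("Silwet % (v/v):", round(percent_v_v(200, 1000), 3))
```
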

Arabidopsis transformation by Dipping 

500 ml of resuspended Agrobacterium was poured into a tray and plants were inverted
into the Agrobacterium solution in batches of 10 for 15 minutes. After 15 minutes the plants were
lifted and the excess solution drained. The plants were transferred on their sides to a fresh tray
containing tissue paper to allow further soaking of the solution and then transferred to
propagating trays. The plants were immediately covered with lids to maintain humidity. After
two days the lid was removed and the plants allowed to grow normally. They were not watered
for one week, until the soil looked dry. After flowering was complete and the siliques on the
plants were dry, all the seeds from one pot were harvested. The seeds were completely dried by
keeping harvested seed in an envelope for one week.

EXAMPLE 13: Selection of transformed Arabidopsis thaliana seed.

Seed produced from transformed Arabidopsis thaliana c.v. Columbia plants was weighed
into 10 mg aliquots, equivalent to about 500 individual seeds, and placed into a sterile 15 ml tube.
The seed was surface sterilised by treating with 10 ml of Teepol bleach/Tween 20 solution (500
ml of 50% (v/v) Teepol bleach containing 1 drop of Tween 20) for five minutes. The seeds were
then washed four times with 10 ml Tween 20 in sterile water (1 drop Tween 20 in 500 ml sterile
water). The seeds were then suspended in 5 ml sterile water and 5 ml warm 0.5% agar, mixed
carefully and then half of the seeds were spread over one petri dish containing half strength
Murashige and Skoog agar medium and the other half over a second dish containing half strength
Murashige and Skoog agar medium plus 50 µg/ml kanamycin. The plates were sealed and
incubated at 4°C for 48 hours. The plates were then transferred to a growth room under low light
(2000 lux). Seed on both types of plate germinated, but on the plates containing kanamycin
non-resistant plants bleached and died within 7 days. Figure 8 demonstrates this selection of
kanamycin resistant seedlings. After 14 days the resistant plants were transferred from the
selective medium onto MS medium for a further 10 days before being transferred into soil. The
plants were grown on to produce leaf material for further analysis.

EXAMPLE 14: Analysis of Arabidopsis thaliana Plants Transformed with pCL68 SCV for the Presence of the PGSIP Construct

For the pCL68 SCV transformed lines a total of 31 kanamycin resistant plants were
obtained from four of the original floral dips. These were tested for the presence of the construct
by PCR.

Genomic DNA extraction 

Leaf material was taken from regenerated Arabidopsis thaliana plants transformed with
pCL68 SCV and genomic DNA isolated. One leaf was excised from a plant growing in soil and
placed in a 1.5 ml eppendorf tube. The tissue was homogenised using a micropestle and 400 µl
extraction buffer (200 mM Tris HCl pH 8.0; 250 mM NaCl; 25 mM EDTA; 0.5% SDS) was
added and ground again carefully to ensure thorough mixing. Samples were vortex mixed for
approximately 5 seconds and then centrifuged at 10,000 rpm for 5 minutes. A 350 µl aliquot of
the resulting supernatant was placed in a fresh eppendorf tube and 350 µl chloroform was added.
After mixing, the sample was allowed to stand for 5 minutes. This was then centrifuged at
10,000 rpm for 5 minutes. A 300 µl aliquot of the supernatant was removed into a fresh eppendorf
tube. To this was added 300 µl of propan-2-ol and mixed by inverting the eppendorf several
times. The sample was allowed to stand for 10 minutes. The precipitated DNA was collected by
centrifuging at 10,000 rpm for 10 minutes. The supernatant was discarded and the pellet air dried.
The pellet of DNA was resuspended in 50 µl of distilled water and was used as a template in
PCR.



PCR detection of PGSIP

A pair of optimised oligonucleotide primers were designed and synthesised to enable the 
detection of the pCL68 SCV construct in transformed plants. The sequences of these primers 
were: 

ATGLY002: CGTCTCGTGTCTGGTTTATATTCA 
ATGLY003: TCGATGCCTGAGATCTCAGCT 
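
As a quick sanity check on the primer pair listed above, the following minimal sketch reports each primer's length, GC fraction and a rule-of-thumb melting temperature. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption introduced here for illustration; the specification does not state how the primers were optimised, and the function names are hypothetical.

```python
# Primer sequences copied from the listing above.
PRIMERS = {
    "ATGLY002": "CGTCTCGTGTCTGGTTTATATTCA",
    "ATGLY003": "TCGATGCCTGAGATCTCAGCT",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm (assumption, not from the source): 2 per A/T, 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Wallace Tm approx. {wallace_tm(seq)} degC")
```

The Wallace estimate is only a rough guide for short oligonucleotides; it is shown here to make the comparison between the two primers concrete, not to justify the 55°C annealing step used in the text.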

PCR mixtures were set up containing 5 µl 10x Advantage Taq buffer; 5 µl 2 mM dNTPs; 0.5 µl of
primer ATGLY002 (100 µM); 0.5 µl of primer ATGLY003 (100 µM); 5 µl DNA template
(Arabidopsis thaliana genomic DNA or control pCL68 SCV plasmid DNA); 0.25 µl Advantage
Taq polymerase; and 33.75 µl distilled water, in a final volume of 50 µl. The PCR was
carried out on a thermocycler using the following parameters: first a hot start at 94°C for 5 min,
then 25 cycles consisting of 94°C for 15 sec, 55°C for 30 sec, and 72°C for 3 min. The cycles
were followed by 72°C for 5 min and a final step of holding the samples at 8°C.

A diagnostic DNA fragment of 977 bp was produced in these reactions. 
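
The reaction set-up described above can be checked arithmetically. The sketch below sums the listed component volumes, scales them into a master mix, and estimates the programmed run time of the thermocycler. The 10% pipetting overage and all function names are assumptions added for illustration; ramp times between temperatures are ignored.

```python
# Per-reaction volumes in microlitres, as listed in the text.
REACTION_UL = {
    "10x Advantage Taq buffer": 5.0,
    "2 mM dNTPs": 5.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "DNA template": 5.0,
    "Advantage Taq polymerase": 0.25,
    "distilled water": 33.75,
}


def master_mix(n_reactions: int, overage: float = 0.1) -> dict:
    """Scale all components except the template; overage is an assumed safety margin."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in REACTION_UL.items() if name != "DNA template"}


def cycling_minutes(cycles: int = 25, hot_start_s: int = 300, denature_s: int = 15,
                    anneal_s: int = 30, extend_s: int = 180, final_s: int = 300) -> float:
    """Programmed hold time only (no ramping), using the parameters in the text."""
    total_s = hot_start_s + cycles * (denature_s + anneal_s + extend_s) + final_s
    return total_s / 60


print("Total reaction volume (ul):", sum(REACTION_UL.values()))
print("Master mix for 12 reactions:", master_mix(12))
print("Programmed run time (min):", cycling_minutes())
```

Summing the components confirms the stated 50 µl final volume, and 5 µl of 2 mM dNTPs in 50 µl corresponds to a final concentration of 0.2 mM.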

The PCR results for pCL68 SCV transformed plants indicated that 30 of the 31
plants examined had been successfully transformed. Thus, all of the plants except for
the plant labeled 1-005 contained the PGSIP gene.

EXAMPLE 15: Analysis of Arabidopsis thaliana Plants Transformed with pCL76 SCV for the Presence of the PGSIP Downregulation Construct.

For the pCL76 SCV transformed lines a total of 10 kanamycin resistant plants were
obtained. Leaf material was taken from regenerated Arabidopsis thaliana plants transformed
with pCL76 SCV and genomic DNA isolated. One leaf was excised from a plant growing in soil
and placed in a 1.5 ml eppendorf tube. The tissue was homogenised using a micropestle and
400 µl extraction buffer (200 mM Tris HCl pH 8.0; 250 mM NaCl; 25 mM EDTA; 0.5% SDS)
was added and ground again carefully to ensure thorough mixing. Samples were vortex
mixed for approximately 5 seconds and then centrifuged at 10,000 rpm for 5 minutes. A 350 µl
aliquot of the resulting supernatant was placed in a fresh eppendorf tube and 350 µl
chloroform was added. After mixing, the sample was allowed to stand for 5 minutes. This
was then centrifuged at 10,000 rpm for 5 minutes. A 300 µl aliquot of the supernatant was
removed into a fresh eppendorf tube. To this was added 300 µl of propan-2-ol and mixed by
inverting the eppendorf several times. The sample was allowed to stand for 10 minutes. The
precipitated DNA was collected by centrifuging at 10,000 rpm for 10 minutes. The
supernatant was discarded and the pellet air dried. The pellet of DNA was resuspended in
50 µl of distilled water and was used as a template in PCR.

PCR detection of PGSIP RNAi DNA

A pair of optimised oligonucleotide primers were designed and synthesised to enable 
the detection of the pCL76 SCV construct in transformed plants. The sequences of these 
primers were: 

ATGLY001: TTTGAACAAACAAAAAGGTGGAAC
ATGLY002: CGTCTCGTGTCTGGTTTATATTCA 

PCR mixtures were set up containing 5 µl 10x Advantage Taq buffer; 5 µl 2 mM dNTPs; 0.5 µl of
primer ATGLY001 (100 µM); 0.5 µl of primer ATGLY002 (100 µM); 5 µl DNA template
(Arabidopsis thaliana genomic DNA or control pCL76 SCV plasmid DNA); 0.25 µl
Advantage Taq polymerase; and 33.75 µl distilled water, in a final volume of 50 µl.
The PCR was carried out on a thermocycler using the following parameters: first a hot start at
94°C for 5 min, then 25 cycles of 94°C for 15 sec, 55°C for 30 sec, and 72°C for 3 min. The
cycles were followed by 72°C for 5 min and the samples were then held at 8°C.

A diagnostic DNA fragment of 819 bp was produced in these reactions.
Out of 8 kanamycin resistant plants tested, 2 were shown to contain the PGSIP RNAi gene
construct.

EXAMPLE 16: Constitutive Overexpression and Downregulation of PGSIP Gene in Barley.

Starch is made in the leaves and the grain. To test the effect of overexpressing and
downregulating the PGSIP gene in a monocot species, plasmids pCL68 SCV (sense
construct) and pCL76 SCV (RNAi construct) were expressed in barley. These plasmids
conferred constitutive expression as the genes were under the control of the double 35S
promoter. Additionally, the full length gene and the RNAi cassette were expressed under the
control of the rice actin promoter (US patent number 56141876). For this purpose, the
Gateway cloning technology (Invitrogen) was used according to the manufacturer's instructions
with slight modification. The full length PGSIP was excised from plasmid pMC168 with
NcoI-EcoRI and cloned into pENTR4 vector cut with NcoI-EcoRI, resulting in plasmid
pMC175. The RNAi cassette was excised from plasmid pCL76 SCV with SalI-EcoICRI and
cloned into pENTR1 vector cut with SalI-EcoRV, resulting in plasmid pMC174. These
plasmids were then recombined with Destination vector pWP492R12 SCV, which contained the
actin promoter flanked by two recombination sites (attR1 and attR2 on either side)
(Invitrogen). This resulted in plasmids pMC177 and pMC176 respectively, which contained
the PGSIP gene and the RNAi construct under the control of the rice actin promoter (US
patent number 56141876). These plasmids are shown in Figs. 9 and 10.

The constructs were transformed into Agrobacterium strain AGL-1 (Lazo et al.,
1991, Bio/Technol 9: 963-967) for barley transformation. Immature embryos of the barley
variety Golden Promise were transformed essentially according to the method of Tingay et al.
(The Plant Journal 11(6): 1369-1376, 1997). Donor plants of Golden Promise were grown
with an 18 hour day at 18/13°C. Immature embryos (1.5 - 2.0 mm) were isolated and the
axes removed. They were then dipped into an overnight liquid culture of Agrobacterium,
blotted and transferred to co-cultivation medium. After 2 days the embryos were transferred
to MS based callus induction medium with Asulam and Timentin for 10 days. Tissues were
transferred at 2 weekly intervals, and at each transfer they were cut into small pieces and
lined out on the plate. At the third transfer, only the embryogenic tissue was moved on to
fresh medium. After a total of 8 weeks in culture, the tissue was transferred to regeneration
medium (FHG), where plantlets formed within 2 - 4 weeks. These were transferred to
Beatsons jars with growth regulator free medium until roots had formed, when they were
transferred to Jiffy expandable peat pellets and then to the Conviron growth chambers.

The plants were analysed by PCR using the following primers.

For plants containing pCL68 plasmid (sense expression)

5'-ATTTGGAGAGGACAGCCCAAGC-3' Glyc For

5'-CTCCATCGTTGGATCTCGTTCG-3' Glyc Rev (S)

For plants containing pCL76 plasmid (RNAi expression)

5'-ATTTGGAGAGGACAGCCCAAGC-3' Glyc For

5'-GCGTCATCTTCATCGCCAATCC-3' Glyc Rev (D)

PCR was carried out as described above.

Results: 

Six barley plants were regenerated after transformation with plasmid pCL68 SCV and 
eight plants with plasmid pCL76 SCV. The plants were first analysed by PCR and the leaves 
of the positive plants were subjected to iodine staining by Lugol. The results of PCR analysis 
are presented in Table 7. 

Table 7. Results of PCR screen of barley plants transformed with pCL68 SCV or pCL76 SCV.

Construct | Plant no    | PCR no. | PCR
          | Control 1   | GG11    | Neg
          | Control 2   | GG12    | Neg
          | Control 3   | GG13    | Neg
pCL68     | 1           | GG1     | Pos
pCL68     | 2           | GG2     | Neg
pCL68     | 3           |         |
pCL68     | 4.1         | GG8     | Neg
pCL68     | 5.1         |         |
pCL68     | 6.1         | GG3     | Neg
pCL68     | 6.2         |         |
pCL68     | [illegible] | GG9     | Neg
pCL68     | 7.1         | GG10    | Neg
pCL76     | 1.1         | GG4     | Pos
pCL76     | 1.2         | GG5     | Pos
pCL76     | 1.3         | GG6     | Pos
pCL76     | 1.4         | GG14    | Pos
pCL76     | 1.5         | GG15    | Neg
pCL76     | 2           | GG7     | Neg
pCL76     | 3.1         | GG16    | Pos
pCL76     | 4.1         | GG17    | Neg


One plant containing the sense construct was found to contain more starch granules in 
its leaves relative to control plants without the sense construct. The plants containing the 
RNAi construct were found to lack starch granules as shown in Figure 11A.

EXAMPLE 17: Seed Specific Overexpression and Downregulation of the PGSIP Gene in Barley

For seed specific expression, the plasmids pMC174 and pMC175 were recombined
with the plasmid pWP491R12SCV that contained the seed specific promoter flanked by two
recombination sites (attR1 and attR2 on either side (Invitrogen)). Barley plants were
transformed according to the method of Tingay et al. (1997) with some modification as
described for Example 13.

EXAMPLE 18: Analysis of Transformed Solanum tuberosum Plants for Presence of the PGSIP Construct

Analysis of regenerated potato transformants.

Leaf material was taken from regenerated potato plants and genomic DNA isolated.
One large potato leaf (approximately 30 mg) was excised from an in vitro grown plant and
placed in a 1.5 ml eppendorf tube. The tissue was homogenised using a micropestle and 400 µl
extraction buffer (200 mM Tris HCl pH 8.0; 250 mM NaCl; 25 mM EDTA; 0.5% SDS) was
added and ground again carefully to ensure thorough mixing. Samples were vortex mixed for
approximately 5 seconds and then centrifuged at 10,000 rpm for 5 minutes. A 350 µl aliquot of
the resulting supernatant was placed in a fresh eppendorf tube and 350 µl chloroform was
added. After mixing, the sample was allowed to stand for 5 minutes. This was then
centrifuged at 10,000 rpm for 5 minutes. A 300 µl aliquot of the supernatant was removed into
a fresh eppendorf tube. To this was added 300 µl of propan-2-ol and mixed by inverting the
eppendorf several times. The sample was allowed to stand for 10 minutes. The precipitated
DNA was collected by centrifuging at 10,000 rpm for 10 minutes. The supernatant was
discarded and the pellet air dried. The pellet of DNA was resuspended in 50 µl of distilled
water and was used as a template in PCR.

PCR mixtures were set up containing 5 µl 10x Advantage Taq buffer; 5 µl 2 mM dNTPs; 0.5
µl of either primer ATGLY001 or ATGLY003 (100 µM); 0.5 µl of primer ATGLY002
(100 µM); 5 µl DNA template (Solanum tuberosum c.v. Prairie genomic DNA, control pCL68
SCV plasmid DNA or control pCL76 SCV plasmid DNA); 0.25 µl Advantage Taq
polymerase; and 33.75 µl distilled water, in a final volume of 50 µl. The PCR was
carried out on a thermocycler using the following parameters: first a hot start at 94°C for 5
min, followed by 25 cycles of 94°C for 15 sec, 55°C for 30 sec, and 72°C for 3 min. The
cycles were followed by 72°C for 5 min and finally by holding the temperature at 8°C.

A diagnostic DNA fragment of 977 bp was produced in these reactions from plasmid
pCL68 SCV or 819 bp from plasmid pCL76 SCV. Lines of Solanum tuberosum c.v. Prairie
transformed with pCL68 SCV or pCL76 SCV were tested by PCR and were shown to contain
the construct.

Of 18 plants transformed with pCL68 SCV, all 18 contained the sense PGSIP construct. For the
PGSIP RNAi construct (pCL76 SCV), 3 out of 8 plants contained the construct.
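
The PCR screening results reported in Examples 14, 15 and 18 can be put side by side as simple fractions. The sketch below is illustrative only: the counts are those stated in the text, while the dictionary layout and function name are hypothetical.

```python
# (PCR-positive plants, plants tested) as reported in the examples above.
PCR_SCREEN = {
    ("Arabidopsis thaliana", "pCL68 SCV"): (30, 31),
    ("Arabidopsis thaliana", "pCL76 SCV"): (2, 8),
    ("Solanum tuberosum", "pCL68 SCV"): (18, 18),
    ("Solanum tuberosum", "pCL76 SCV"): (3, 8),
}


def positive_fraction(positive: int, tested: int) -> float:
    """Fraction of tested plants in which the construct was detected."""
    if tested <= 0:
        raise ValueError("tested must be a positive integer")
    return positive / tested


for (species, construct), (pos, n) in PCR_SCREEN.items():
    print(f"{species}, {construct}: {pos}/{n} = {positive_fraction(pos, n):.1%}")
```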

EXAMPLE 19: Analysis of Transformed Plants for PGSIP Expression. 

Raising antisera to PGSIP proteins. 

Expression of PGSIP proteins can be analysed by Western blotting. Antibodies to PGSIP
are raised by inoculating rabbits with peptides corresponding to the Arabidopsis thaliana PGSIP
protein sequences produced by expressing the sequence as a transcriptional fusion with
glutathione-S-transferase in E. coli cells.

Preparation of protein extracts. 

Protein extracts from potato tuber were produced by taking up to 100 mg of tissue and
homogenising in 1 ml of ice cold extraction buffer consisting of 50 mM HEPES pH 7.5, 10 mM
EDTA, 10 mM DTT. Additionally, protease inhibitors such as PMSF or pepstatin were included
to limit the rate of protein degradation. The extract was centrifuged at 13000 rpm for 1 minute
and the supernatant decanted into a fresh eppendorf tube and stored on ice. The supernatant was
assayed for soluble protein content using, for example, the BioRad dye-binding protein assay
(Bradford, M.M. (1976) Anal. Biochem. 72, 248-254).

An aliquot of the soluble protein sample, containing between 10-50 µg total protein, was
placed in an eppendorf tube and excess acetone (ca 1.5 ml) added to precipitate the proteins, which
were collected by centrifuging the sample at 13000 rpm for 5 minutes. The acetone was decanted
and the samples air-dried until all the residual acetone had evaporated.

SDS-polyacrylamide gel electrophoresis. 

The protein samples were separated by SDS-PAGE. SDS-PAGE loading buffer (2%
(w/v) SDS; 12% (w/v) glycerol; 50 mM Tris-HCl pH 8.5; 5 mM DTT; 0.01% Serva blue G250)
was added to the protein samples (up to 50 µl). Samples were heated at 70°C for 10 minutes
before loading onto a NuPage polyacrylamide gel. The electrophoresis conditions were 200 V
constant for 1 hour on a 10% Bis-Tris precast polyacrylamide gel, using 50 mM MOPS, 50 mM
Tris, 1 mM EDTA, 3.5 mM SDS, pH 7.7 running buffer, according to the NuPage methods
(Invitrogen, US 5,578,180).

Electroblotting.

Separated proteins were transferred from the acrylamide gel onto PVDF membrane by
electroblotting (transfer buffer: 20% methanol; 25 mM Bicine pH 7.2; 25 mM Bis-Tris; 1 mM
EDTA; 50 µM chlorobutanol) in a Novex blotting apparatus at 30 V for 1.5 hours.

Immunodetection. 

After blocking the membrane with 5% milk powder in Tris buffered saline (TBS-Tween) 
(20mM Tris, pH 7.6; 140mM NaCl; 0.1% (v/v) Tween-20), the membrane was challenged with 
a rabbit anti-PGSIP antiserum at a suitable dilution in TBS-Tween. Specific cross-reacting 
proteins were detected using an anti-rabbit IgG-horseradish peroxidase conjugate secondary
antibody and visualised using the enhanced chemiluminescence (ECL) reaction (Amersham 
Pharmacia). 

Detection of mRNA. 

Expression of PGSIP mRNA was analysed in plants by RT-PCR or by Northern blotting.

EXAMPLE 20: Analysis of Leaf Starch Content 

Samples of leaves (about 60 mg) from control and transformed Arabidopsis thaliana plants which had
been grown for 24 hours under high light were placed in a microfuge tube and
extracted with 100 µl of 45% HClO4. This suspension was diluted with 1 ml of distilled water
and centrifuged (14000 rpm, 2 min). Aliquots of the extracts were then analysed for starch
content by taking 100 µl of the extract and mixing with an equal volume of Lugol's solution, the
optical density of which was then measured at 540 nm using a microplate reader. Standard starch
mixtures were prepared in the same way and measured at the same time and the starch content
of the extracts was calculated by reference to these standards.
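
Calculating starch content "by reference to standards" is commonly done with a standard curve. The sketch below assumes a linear relationship between optical density at 540 nm and starch amount, fits it by ordinary least squares, and inverts it for an unknown sample. The linear model, the standard values and the function names are all hypothetical illustrations; the specification does not say how the calibration was fitted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def starch_from_od(od, slope, intercept):
    """Invert the standard curve to estimate starch from a measured OD."""
    return (od - intercept) / slope


# Hypothetical standards: starch per well (ug) and the OD540 read for each.
standard_ug = [0.0, 10.0, 20.0, 40.0]
standard_od = [0.05, 0.15, 0.25, 0.45]

slope, intercept = fit_line(standard_ug, standard_od)
sample_ug = starch_from_od(0.35, slope, intercept)
print(f"slope={slope:.4f} OD/ug, intercept={intercept:.3f}, sample={sample_ug:.1f} ug")
```

To express the result per gram fresh weight, the estimate would additionally be scaled by the extract dilution and the tissue mass (about 60 mg in the text).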

Table 8. Starch contents of leaves of Arabidopsis thaliana plants transformed with pCL68 SCV
(sense construct comprising SEQ ID NO: 1) compared with the starch contents of leaves of non
[...] for three plants.

Sample      | Leaf starch content (µg/g fresh weight, FWt)
37256       | 19.95
1-002       | 12.68
1-003       | 49.68
1-004       | 48.02
1-005       | 13.88
37407       | 17.47
37437       | 49.55
37468       | 24.88
37499       | 8.65
37529       | 17.71
37560       | 15.93
37590       | 9.95
37621       | 6.02
37257       | 21.9
[illegible] | [illegible]
37316       | 11.82
37261       | 22.85
37381       | 9.51
37412       | 13.21
37442       | 33.60
37473       | 17.96
37504       | 8.88
37534       | 18.58
37565       | 11.98
37295       | 32.83
37323       | 38.43
37354       | 16.16
ncc         | 22.59 (±5.08)



The ncc value represents the mean and standard error for the three control plants. Each
data point otherwise represents a single leaf from an individual plant. Taking the error of the
control as a measure of the population variation, then plants 1-003, 1-004, 1-007, 1-008, 6-007
and 9-003 have significantly more starch in their leaves than the controls. Plants 1-009, 1-012,
1-013, 2-003, 6-005, 6-009 and 6-011 have significantly lower starch contents. The copy number
and level of expression of the sense construct in the plants are to be determined. The results
demonstrate that a sense construct comprising SEQ ID NO: 1 can effectively alter the content
of starch.
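
The comparison against the control can be made explicit with a small sketch. The text does not state the exact threshold used; the code below assumes a band of the control mean plus or minus two standard errors (22.59 ± 2 × 5.08), a factor inferred here because it yields six higher and seven lower samples from the legible Table 8 values, matching the numbers of plants listed in the text. The factor and the function name are assumptions.

```python
CONTROL_MEAN = 22.59  # ncc mean from Table 8
CONTROL_SE = 5.08     # ncc standard error from Table 8


def classify(value: float, k: float = 2.0) -> str:
    """Compare a leaf starch value with the control band mean +/- k * SE (k is assumed)."""
    if value > CONTROL_MEAN + k * CONTROL_SE:
        return "higher"
    if value < CONTROL_MEAN - k * CONTROL_SE:
        return "lower"
    return "within"


# Legible starch values from Table 8, in table order (the ncc row is excluded).
values = [19.95, 12.68, 49.68, 48.02, 13.88, 17.47, 49.55, 24.88, 8.65,
          17.71, 15.93, 9.95, 6.02, 21.9, 11.82, 22.85, 9.51, 13.21,
          33.60, 17.96, 8.88, 18.58, 11.98, 32.83, 38.43, 16.16]

labels = [classify(v) for v in values]
print("higher:", labels.count("higher"))
print("lower:", labels.count("lower"))
print("1-003 (49.68):", classify(49.68))
print("1-005 (13.88):", classify(13.88))
```

Under this assumption plant 1-005, which lacked the construct by PCR, falls within the control band.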

Table 9. Starch contents of leaves of Arabidopsis thaliana plants transformed with pCL76 SCV 
(RNAi construct) compared to controls. 



Samples    |      | Starch content (µg per leaf)
pCL76 SCV  | 7    | 27.20
pCL76 SCV  | 20.1 | 26.96
Control    | ncc  | 42.97


The data in these tables show that the leaves of the transformed plants have an altered starch
content compared to the untransformed controls (ncc).

EXAMPLE 21: Microscopic Analysis of Starch Granule Size and Number.

Starch granules were extracted from Arabidopsis thaliana or Solanum tuberosum
tissue by taking 50-100 mg of tissue and homogenising in 1% sodium metabisulphite
solution. After filtering the extract through miracloth, the starch was collected by
centrifugation at 1300 rpm for 5 minutes and then resuspended in 1 ml of water. Aliquots were
taken and an equal amount of Lugol solution added to enhance the contrast of the starch
granules. Suspensions were prepared for microscope imaging by placing onto a microscope
slide. Representative micrographs were taken of the samples. The electronically captured
images were then processed using suitable image analysis software, such as the package
Imager. This enabled a quantification of the size distributions of different starch samples to
be made and compared.

Alternatively, samples of purified starch are either suspended in water and viewed
with a light microscope or sputter-coated with gold and viewed with a scanning electron
microscope such as a Philips (Eindhoven, The Netherlands) XL30 Field Emission Gun
scanning electron microscope at 3 kV.

Starch granules can be examined in tissues as well. For example, starch in tissues is 
stained using Lugol's solution (1% Lugol's solution, I-KI [1:2, v/v]; Merck). Starch can then 
be examined, for example, in longitudinal sections of tubers. Alternatively the starch can be 
further isolated subsequent to staining and suspended in water, and stained again with a few 
drops of Lugol's solution and examined microscopically.

The radii of the blue staining core of the starch granules and the total granule are 
measured microscopically using an ocular micrometer. If granules are ovoid in shape, both 
long radius and short radius measurements are taken. The radii of the blue-staining core and 
the total granule are determined by measuring individual, randomly chosen starch granules. 

EXAMPLE 22: Analysis of Starch Functionality.

Preparation of starch. 

Starch was extracted from potato tubers by taking 0.5-1 kg of washed tuber tissue and
homogenising using a juicerator chased with 200 ml of 1% sodium bisulphite solution. The
starch was allowed to settle, the supernatant decanted off and the starch washed by
resuspending in 200 ml of ice-cold water. The resulting starch pellet was left to air dry. Once
dried the starch was stored at -20°C.

Alternatively, other methods can be utilized to isolate starch. For example, samples of
tubers are first homogenized in extraction buffer (10 mM EDTA, 50 mM Tris, pH 7.5, 1 mM
DTT, 0.1% Na2S2O5). The resulting fibrous substance is then washed several times with the
extraction buffer and filtered. The filtrate is allowed to settle at 4 °C and the supernatant is
discarded after the starch granules have settled. Starch granules are then washed with
extraction buffer, water, and acetone and dried at 4 °C.

With maize and other cereal crops, seeds are soaked in 50ml of a 20 mM sodium 
acetate, pH 6.5, 10 mM mercuric chloride solution. After 24 hr, the germ and pericarp are
removed and 50 ml of fresh solution is added for an additional 24 hr. Endosperm is 
repeatedly homogenized for 1 minute intervals in a mortar and pestle, and freed starch 
granules are purified by multiple extractions with saline and toluene (Boyer et al., 1976, 
Cereal Chemistry 53: 327-337). Granular starch is washed three times with double distilled 
water, once with acetone, and dried at 40 °C. 

Viscometric analysis of starch. 

Starch samples were analysed for functionality by testing rheological properties using
viscometric analysis (rapid visco analyzer (RVA) or differential scanning calorimetry
(DSC)). Viscosity of starches can also be measured by various other techniques. For
example, a Rapid Visco Analyser Series 4 instrument (Newport Scientific, Sydney, Australia)
can be utilized with a 13 min profile where 2 g of starch are analyzed in water at a
concentration of 7.4% (w/v), using the stirring and heating protocol
suggested by Newport Scientific. For longer profiles, 2.5 g starch samples are used at a
concentration of 10% (w/v). The sample is heated while stirring at 1.5 °C min⁻¹ from 50 °C
to 95 °C for 15 min then cooled to 50 °C at 1.5 °C min⁻¹. Viscosity is measured in centipoise (cP).
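
The quantities in the viscometric protocol can be cross-checked with simple arithmetic. The sketch below derives the suspension volume implied by a starch mass and a % (w/v) concentration, and the time needed for a linear temperature ramp. It is a worked illustration with hypothetical function names, not part of the instrument's protocol.

```python
def suspension_volume_ml(starch_g: float, percent_w_v: float) -> float:
    """Volume giving the stated % (w/v), where 1% (w/v) = 1 g per 100 ml."""
    return starch_g * 100.0 / percent_w_v


def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Duration of a linear temperature ramp at the given rate."""
    return abs(end_c - start_c) / rate_c_per_min


print(f"2 g at 7.4% (w/v): {suspension_volume_ml(2.0, 7.4):.2f} ml")
print(f"2.5 g at 10% (w/v): {suspension_volume_ml(2.5, 10.0):.2f} ml")
print(f"50 to 95 degC at 1.5 degC/min: {ramp_minutes(50, 95, 1.5):.0f} min")
```

The heating ramp alone takes 30 minutes at the stated rate, which is consistent with this being described as one of the longer profiles rather than the 13 min profile.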

EXAMPLE 23: Analysis of Fine Structure of Starch

Amylopectin chain length distribution 

One method for examining the fine structure of starch is 14C labeling of amylopectin
chains to determine chain lengths. Extracted starch granules are suspended at 25 mg ml⁻¹ in
medium comprising 100 mM Bicine (pH 8.5), 25 mM potassium acetate, 10 mM DTT, 5
mM EDTA, 1 mM ADP[U-14C]glucose at 18.5 GBq mol⁻¹ and 10 µl starch suspension in a
total volume of 100 µl, for each sample. Samples are then incubated for 1 hour at 25 °C. The
incubation is terminated by addition of 3 ml of 750 ml l⁻¹ aqueous methanol containing 10 g l⁻¹
KCl (methanol/KCl). After incubation for at least 5 minutes at room temperature, starch is
collected by centrifugation at 2000 g for 5 min. The supernatant is discarded and the pellet is
resuspended in 0.3 ml distilled water. The methanol/KCl wash, centrifugation, and
resuspension are repeated 2-4 times. The resulting pellets are dried at room temperature,
dissolved with 50 µl 1 M NaOH, and diluted with 50 µl distilled water. To determine the
average length of amylopectin chains into which 14C was incorporated, products of incubation
with ADP[U-14C]glucose are debranched with isoamylase and subjected to chromatography
on a column of Sepharose CL-4B. The glucan eluting earlier from the column consists of
longer chains than glucan eluting later from the column.

Another method for examining the fine structure of starch is chromatography without
labeling. A 10 mg sample of isolated starch is dissolved in 100 µl 0.1 M NaOH for 1 hour at
95 °C. The sample is diluted in 900 µl water and 150 µl 1 M sodium citrate (pH 5.0). The starch
is then debranched by adding 300 units of isoamylase, or hydrolysed with 300 units of
alpha-amylase or beta-amylase, for 24 hours at 37 °C. A 100 µl aliquot of the hydrolysed
sample is analyzed by chromatography, for example HPAE-PAD chromatography (CarboPac
PA-100 column; Dionex, Idstein, Germany; flow 1 ml min⁻¹; buffer A: 150 mM NaOH;
buffer B: 1 M sodium acetate in buffer A) with an applied gradient comprising 0-5 min 100%
A; 5-20 min 85% A, 15% B; 20-35 min 70% A, 30% B (linear); 35-80 min 50% A, 50% B
(convex).

Alternatively, HPLC chromatography is utilized, where partially hydrolyzed
debranched starch samples are dissolved in 0.01 N NaOH (5 mg/ml), and 2 ml are applied to a size
exclusion column (Sephadex G-75, 1.5 x 100 cm). The mobile phase is 0.01 N NaOH and
the flow rate is 0.6-0.9 ml/min. Samples are analyzed for total carbohydrate by the
phenol-sulfuric acid test (Hodge and Hofreiter, 1962, Vol. 1, R.L. Whistler and M.L. Wolfrom (Eds.),
Academic Press, New York, pp: 388-389) and the Park Johnson test
for reducing ends (Porro et al., 1981, Anal Biochem. 118(2):301-6). Based on these two
analyses the average chain length for each fraction is calculated.
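
The calculation implied here can be sketched as follows, assuming that each debranched chain carries exactly one reducing end, so that the average chain length of a fraction is the molar amount of glucose residues (total carbohydrate) divided by the molar amount of reducing ends. The fraction values and the function name are hypothetical; the specification does not give the formula explicitly.

```python
def average_chain_length(total_glucose_umol: float, reducing_ends_umol: float) -> float:
    """Average degree of polymerisation, assuming one reducing end per chain."""
    if reducing_ends_umol <= 0:
        raise ValueError("reducing end amount must be positive")
    return total_glucose_umol / reducing_ends_umol


# Hypothetical column fractions: (total carbohydrate, reducing ends), both in umol.
fractions = {
    "early fraction": (120.0, 2.0),
    "late fraction": (120.0, 8.0),
}

for name, (total, ends) in fractions.items():
    print(f"{name}: average chain length = {average_chain_length(total, ends):.0f}")
```

With equal total carbohydrate, the fraction with fewer reducing ends contains longer chains, which is the pattern expected for material leaving a size exclusion column earlier.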

Amylopectin is further characterized by measuring the low molecular weight to high 
molecular weight chain ratio (on a weight basis) according to the method of Hizukuri 
(Hizukuri, 1986, Carbohydrate Research, 147, 342-347). 

An alternative method for analyzing amylopectin chains is gel electrophoresis. Starch 
samples are debranched with isoamylase, derivatised with fluorophore APTS, and subjected 
to gel electrophoresis in an Applied Biosystems DNA sequencer. Data are analyzed by
Genescan software. The method allows for identification of authentic maltohexaose and 
maltoheptaose as well as a determination of percent molar differences and the degree of 
polymerization, distribution of chain lengths, between samples. 

Amylose content of starch 

Amylose percentages are determined by gel permeation chromatography according to 
Denyer et al. (Denyer et al., 1995, Plant Cell Environ 18:1019-1026) or by gel filtration 
analysis according to Boyer and Liu (Boyer and Liu, 1985, Starch Starke 37:73-79). 

Alternatively, the amylose contents are determined spectrophotometrically in 1 to 2
mg isolated starch according to the iodometric method described by Hovenkamp-Hermelink 
et al. 1988. Amperometric titrations are performed according to Williams et al 1970 to 
determine the average amylose content per sample. 

EXAMPLE 24: cDNA Isolation From Barley 

A database search using the Arabidopsis genes At3g18660 and At1g77130 against an
in-house database identified two barley sequences. The accessions corresponding to
Genbank: BE438665 and Genbank: BE438754 showed significant similarity to the
Arabidopsis PGSIP genes (9e-34). The sequences called Barley SEQ1 and Barley SEQ2 are
shown in SEQ ID NOs: 16 and 18.

All publications, patents and patent applications mentioned in this specification are
herein incorporated by reference into the specification to the same extent as if each individual
publication, patent or patent application was specifically and individually indicated to be
incorporated herein by reference.

CLAIMS



1. An isolated nucleic acid molecule that: 

(i) comprises a nucleotide sequence which encodes a polypeptide comprising the 
amino acid sequence of SEQ ID NO: 3, or a fragment thereof; 

(ii) comprises a nucleotide sequence at least 40% identical to SEQ ID NOs: 1 or 
2, or a complement thereof; or 

(iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 1 or 2 under 
low stringency conditions of hybridization, or a complement thereof. 

2. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid
molecule comprises SEQ ID NOs: 1 or 2 or a complement thereof. 

3. The isolated nucleic acid molecule of claim 1, comprising a nucleotide 
sequence selected from the group consisting of nucleotide residues 516-592, 681 to 918, 1039 
to 1655, 1762 to 2536, and 2991 to 3264 of SEQ ID NO: 1. 

4. An isolated nucleic acid molecule that: 

(i) comprises a nucleotide sequence which encodes a polypeptide comprising the 
amino acid sequence of SEQ ID NO: 11, or a fragment thereof;

(ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NO: 10, or
a complement thereof, wherein the nucleotide sequence does not encode the 
amino acid of SEQ ID NO: 35; or 

(iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NO: 10 under 
stringent conditions of hybridization, or a complement thereof, wherein the 
nucleotide sequence does not encode the amino acid of SEQ ID NO: 35. 

5. The isolated nucleic acid molecule of claim 4, wherein the nucleic acid 
molecule comprises SEQ ID NO: 10 or a complement thereof. 

6. An isolated nucleic acid molecule which encodes a polypeptide comprising the
amino acid sequence that is at least 98% identical to SEQ ID NO: 9. 

7. An isolated nucleic acid molecule thereof comprising the nucleotide sequence 
of SEQ ID NO: 8 or a complement thereof. 

8. An isolated nucleic acid molecule that: 

(i) comprises a nucleotide sequence which encodes a polypeptide comprising the 
amino acid sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 
32, 34, or a fragment thereof; 

(ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NOs: 4, 5, 
6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof; or 

(iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 4, 5, 6, 12, 
14, 16, 18, 20, 23, 25, 27, 29, 31, 33 under stringent conditions of 
hybridization, or a complement thereof. 

9. The isolated nucleic acid molecule of claim 8, wherein the nucleic acid 
molecule comprises SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a 
complement thereof.

10. A fragment of the isolated nucleic acid molecule of any one of claims 1-9, 
wherein the fragment comprises at least 40, 60, 80, 100 or 150 contiguous nucleotides of the 
nucleic acid molecule. 

11. The isolated nucleic acid molecule of claim 1 comprising the nucleotide
sequence of nucleotides 1-195 of SEQ ID NO: 2, or a complement thereof. 

12. An isolated polypeptide comprising the amino acid sequence of amino acid 
residues 1-65 of SEQ ID NO: 3, or a fragment thereof. 
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13. An isolated polypeptide comprising: 

(i) an amino acid sequence that is at least 70% identical to SEQ ID NO: 3 
or a fragment thereof; 

(ii) an amino acid sequence encoded by the nucleic acid molecule of claim 1; or

(iii) an amino acid sequence of SEQ ID NO: 3. 

14. An isolated polypeptide comprising:

(i) an amino acid sequence at least 70% identical to SEQ ID NO: 11, or a
fragment thereof;

(ii) an amino acid sequence encoded by the nucleic acid molecule of claim 4; or

(iii) an amino acid sequence of SEQ ID NO: 11.

15. An isolated polypeptide comprising: 

(i) an amino acid sequence that is at least 98% identical to SEQ ID NO: 9;

(iii) an amino acid sequence encoded by the nucleic acid molecule of SEQ ID NO: 8, or a complement thereof; or

(v) an amino acid sequence of SEQ ID NO: 9, or a fragment thereof.

16. An isolated polypeptide comprising: 

(i) an amino acid sequence that is at least 70% identical to SEQ ID NOs: 
7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34, or a fragment thereof; 

(ii) an amino acid sequence encoded by the nucleic acid molecule of claim 8;

(iii) an amino acid sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 
26, 28, 30, 32, 34.
17. A fragment of a polypeptide comprising at least 5 amino acid residues,
wherein said fragment is a portion of the polypeptide encoded by a nucleic acid molecule 
selected from the group consisting of exon I, exon II, exon III, exon IV and exon V of SEQ 
ID NO: 1. 

18. A polypeptide comprising the amino acid sequence of SEQ ID NO: 3, 7, 9, 11, 13,
15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34 which further comprises one or more
conservative amino acid substitutions.

19. A fusion protein comprising the amino acid sequence of any one of claims 12-18
and a heterologous polypeptide.

20. A fragment or immunogenic fragment of a polypeptide of any one of claims 
12-18, wherein the fragment comprises at least 5, 8, 10, 15, 20, 25, 30 or 35 consecutive
amino acids of the polypeptide. 

21. An antibody that immunospecifically binds to a polypeptide of any one of the 
claims 12-18. 

22. A method for making a polypeptide of any one of the claims 12-18,
comprising the steps of: 

(a) culturing a cell comprising a recombinant polynucleotide encoding the 
polypeptide of any one of claims 12-18 under conditions that allow 
said polypeptide to be expressed by said cell; and 

(b) recovering the expressed polypeptide. 

23. A complex comprising a polypeptide encoded by a nucleic acid molecule of 
any of claims 1-9 and a starch molecule. 
24. The complex of claim 23, wherein the starch molecule comprises from 1 to
700 glucose units. 

25. The complex of claim 23, wherein the starch molecule comprises branching 
chains of glucose polysaccharides. 

26. A vector comprising a nucleic acid molecule of any one of claims 1-9.


27. An expression vector comprising a nucleic acid molecule of any one of claims 
1-9 and at least one regulatory region operably linked to the nucleic acid molecule.

28. The expression vector of claim 27, wherein the regulatory region confers
chemically-inducible, dark-inducible, developmentally regulated, developmental-stage
specific, wound-induced, environmental factor-regulated, organ-specific, cell-specific, and/or 
tissue-specific expression of the nucleic acid molecule or constitutive expression of the 
nucleic acid molecule. 

29. The expression vector of claim 27, wherein the regulatory region is selected 
from the group consisting of a 35S CaMV promoter, a rice actin promoter, a patatin 
promoter, and a high molecular weight glutenin gene of wheat. 

30. An expression vector comprising the antisense sequence of a nucleic acid 
molecule of any one of claims 1-9, wherein the antisense sequence is operably linked to at
least one regulatory region. 

31. A genetically-engineered cell which comprises a nucleic acid molecule of any
one of claims 1-9. 



32. A cell comprising the expression vector of claim 27. 
33. A cell comprising the expression vector of claim 30.

34. A genetically-engineered plant comprising the isolated nucleic acid molecule 
of any of claims 1-9. 

35. The genetically-engineered plant of claim 34 and progeny thereof, further 
comprising a transgene encoding an antisense nucleotide sequence. 

36. The genetically-engineered plant of claim 31, further comprising an RNA
interference construct. 

37. A cell comprising a 35S CaMV constitutive promoter operably linked to a
nucleic acid molecule of SEQ ID NO:2 or a rice actin promoter operably 
linked to an RNA interference construct comprising fragments of a nucleic 
acid molecule of SEQ ID NO:2, wherein said promoter confers expression of 
said fragments. 

38. A method of altering starch synthesis in a plant comprising introducing into a 
plant: 

(i) a nucleic acid sequence comprising a starch primer gene, or a fragment 
thereof; 

(ii) a nucleotide sequence that hybridises under stringent conditions to a 
sequence of (i) or its complement; or 

(iii) an agent which is capable of altering the expression of a sequence of (i) 
or (ii); 

such that starch synthesis is altered relative to a plant without any of the above 
sequences. 
39. A method of altering starch synthesis in a plant comprising, introducing into a
plant an expression vector of claim 27, such that starch synthesis is altered relative to a plant 
without the expression vector. 

40. A method of altering starch synthesis in a plant comprising, introducing into a 
plant at least an expression vector of claim 30, such that starch synthesis is altered in 
comparison to a plant without the expression vector. 

41. A method of altering starch granules in a plant comprising, introducing into a 
plant at least an expression vector of claim 27, such that the starch granules are altered in 
comparison to a plant without the expression vector. 

42. A method of altering starch granules in a plant comprising, introducing into a 
plant at least an expression vector of claim 30, such that the starch granules are altered in 
comparison to a plant without the expression vector. 

43. The method of claim 42, wherein starch granules are absent from leaves of the 
plant comprising at least an expression vector. 

44. A plant part comprising a nucleic acid molecule of any of claims 1-9 or a 
nucleic acid of the method of claim 38, wherein starch synthesis is altered. 

45. The plant part of claim 44, wherein the part is a tuber, seed or leaf. 

46. The modified starch obtained from the plant parts of claim 44, wherein the 
modification is selected from the group consisting of a ratio of amylose to amylopectin, 
amylose content, size of starch granules, quantity of size of starch granules, a ratio of small to 
large starch granules, and rheological properties of the starch as measured using viscometric 
analysis. 
SEQUENCE LISTING

<110> Gemstar (Cambridge) Limited 

<120> Starch modification 

<130> RD-GS-1 

<140> unknown
<141> unknown 


<150> 60/346,907 
<151> 08-01-02 

<150> GB 0119342.4
<151> 08-08-2001 

<160> 35 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 3750 
<212> DNA 

<213> Arabidopsis thaliana 
<220>
<221> CAAT_signal
<222> (373)..(376)

<220>
<221> TATA_signal
<222> (424)..(428)

<220>
<221> intron
<222> (593)..(680)

<220>
<221> intron
<222> (919)..(1038)

<220>
<221> intron
<222> (1656)..(1761)

<220>
<221> intron
<222> (2537)..(2990)

<400> 1 

aatatgtaca tgcaataaaa catagtaata tatttctttc cactatatat atatattgaa 60
ttcaatgact taaaaccttt caaaaaaata tttttgctta tataatcaag tgagttattg 120
gtaaagtgta tctttatttt gaaaaaaaaa ctcattattt tgaaaataaa ttatggttct 180
ctttacaaag aaatgatcaa agtttggtgg acatatatat gtcaatcata agagagtcac 240
aaactgagaa tggagtttaa actaaagagc tacaatatta tccacaattt aaaacatttt 300 
attaaaatca cgataacttc aaaaagagaa aatcaaaaat taactttgtt aaaaaggtgg 360 
gtatgaaaaa tacaattttc ttatttccta acaaaaacaa aaatagaaac aaaggaaatg 420 
tgatataaga agattaaaag agacgttatg tctcacctat atttgctctc tcctcttcct 480
tgtccaattc tactgtccca atccatcagt tttatatggc aaactctccc gctgctcctg 540
cacccaccac cacaaccggt ggtgactccc ggcgacgcct ctccgcgtcc atgtaagtgt 600 
atagtataat actctctaag taatgattaa aaaaatctga acaaaatcgt ctaattgtgg 660 
ctttgtgtgt gtttaagcag agaagcaata tgcaagagga gattccggag aaatagcaaa 720
ggaggtggca gatcggatat ggtgaaaccg tttaatatca taaatttttc gacacaagac 780 
aaaaacagta gttgttgttg tttcaccaag tttcagatcg tgaagcttct cttgtttatc 840 
cttctctctg ccactctctt caccattatc tattctcctg aagcttatca tcattctctt 900 
tcccactcat cttctcggta aatctatttc ttttttccat caccaacatt tacattcttg 960 
acctcaaaaa tgttcacatg caaattttta cttttgcctc tatctcttat aatactatct 1020 
taaaattatg aaattagatg gatatggaga agacaagatc cacgttactt ctcggatctg 1080 
gatataaact gggacgatgt gactaaaacc cttgagaaca tcgaagaagg ccgtacgatc 1140 
ggtgtcttga attttgattc gaacgagatc caacgatgga gagaagtatc caagagcaag 1200 
gacaatgggg atgaagaaaa agttgttgta ttgaatctag attacgcaga caagaatgtg 1260
acttgggacg cactatatcc agagtggatc gatgaggagc aagaaacaga ggtccctgtt 1320 
tgtcctaata tcccgaacat taaggtacct acaagaagac tcgatctgat cgtcgtgaaa 1380
cttccttgtc ggaaagaagg gaattggtcg agagacgtcg ggagattgca tctacagcta 1440 
gcggctgcaa ctgtggcggc ttcggccaaa gggtttttca ggggacatgt gttttttgta 1500 
tctagatgct ttccgattcc gaatcttttc cggtgtaaag atcttgtgtc tcggagaggc 1560
gatgtttggt tgtacaaacc taatcttgat accttgagag acaagcttca gctgcctgta 1620 
gggtcttgtg agctatctct tcctcttggc atccaaggta gaataaaaat gactcccgaa 1680 
attacttgtt tagatttgaa aacaaatttg aaaaatcgtc gctaagttaa ctagtgtctg 1740 
ttttcttcca tgaafctttac agataggcca agcttaggaa accctaaaag agaagcttac 1800 
gcaacaattc ttcactcagc tcacgtttac gtctgcggtg caatcgccgc ggctcagagc 1860 
ataagacagt ctggttcgac gagagacctt gttatccttg ttgatgacaa catcagcggt 1920 
taccaccgga gtggactaga agccgcgggt tggcaaatcc ggacgataca gaggattcga 1980
aaccctaagg cagagaaaga tgcttacaac gaatggaact acagcaagtt ccggctatgg 2040 
cagctgactg attacgacaa aatcattttc atcgacgcgg atctcttaat cttgagaaac 2100 
atcgatttct tgttctcgat gcctgagatc tcagctacag gaaacaatgg aactctgttt 2160 
aattcaggag ttatggtgat cgagccttgc aactgtacgt ttcagcttct gatggaacat 2220 
ataaacgaga ttgagtctta taacggtgga gatcaaggtt acttaaacga ggtattcaca 2280 
tggtggcacc ggattccaaa acatatgaat ttcttgaagc atttttggat tggcgatgaa 2340 
gatgacgcga aacgcaagaa aacagagctt tttggagcag agcctcctgt tctttatgtt 2400 
cttcattacc ttgggatgaa gccgtggtta tgttaccgtg actacgactg taacttcaac 2460 
tccgacatat tcgttgagtt tgctaccgat atcgctcatc gaaaatggtg gatggtccac 2520 
gacgccatgc cacaggtgat tcactctctc ctaaaaacct taatagaact caaaaatcac 2580 
ataatatttt caatctcata ttgtgatcaa tattcaaaat attattaggc gtttagtcat 2640 
gcgttgagag actaactgca tagcattatt tctttctcaa aaatttccaa aacttgaaaa 2700 
aataaataaa ctaaaaatta cttactaccc aagtttagaa taaccatatg aaatttgaat 2760 
atacgaaaat cttggtgggt tagtaaatgc agaattagcc ccctacgcag taggcatcaa 2820 
gttttaatgt ctatgtttta tacaccttat aaaaaaatca tttcaaattt tctttcttta 2880 
tgattagttt aaaaaaacat tggttggcag aaatataaaa atagttagac gttttcccaa 2940 
attattctaa aattgtgacg gttagtaatt accatatatg atattttgca ggaacttcac 3000
caattctgtt acttgcgatc caagcaaaag gcacagctgg aatatgatcg ccggcaagca 3060 
gaggccgcaa attatgccga cggtcattgg aaaataagag taaaggaccc gagattcaaa 3120 
atttgcatcg acaaattatg taattggaaa agtatgctgc ggcattgggg cgaatcaaat 3180 
tggactgact acgagtcttt tgttcccacc ccaccagcca ttaccgtaga ccggagatca 3240 
tcacttcccg gccataactt gtgacgcaat aattatacat acttattaat ggatttca.g 3300 
agttttttgg tttgaattgt tgctgcgaga ttaggtgaat atcagttgtg taactatatc 3360
tttttcctat agtttgttca aattgaataa aacatttttt tgcagtttaa ccacaaaata 3420 
aaacatatgt cgtatttata tgccattttt gtatacaaac acaaactcaa aaatgttagt 3480 
aacattcaaa tagtttatac agaaacgata gattatagac ttacatatag ccaaacaaca 3540 
caaattaatt gatgtaacta aacatatgta gtataattaa actttcgaaa aatccaaatt 3600
tttagtcgaa tcgcagtgta gtatgtatac attacgtata gtatataaat ctatgtgtgt 3660 
gtatatcagt gtatgtattt gtgtatgtat gtacatgtga aaagaatctc tactaaagat 3720 
ttccataata ttcaaccaaa aaccaaagtt 3750



<210> 2 
<211> 1980 
<212> DNA
<213> Arabidopsis thaliana

<220>
<221> CDS
<222> (1)..(1980)

<220>
<221> transit_peptide
<222> (1)..(195)

<400> 2 

atg gca aac tct ccc gct gct cct gca ccc acc acc aca acc ggt ggt 48
Met Ala Asn Ser Pro Ala Ala Pro Ala Pro Thr Thr Thr Thr Gly Gly
1 5 10 15

gac tcc cgg cga cgc ctc tcc gcg tcc ata gaa gca ata tgc aag agg 96
Asp Ser Arg Arg Arg Leu Ser Ala Ser Ile Glu Ala Ile Cys Lys Arg
20 25 30

aga ttc cgg aga aat agc aaa gga ggt ggc aga tcg gat atg gtg aaa 144
Arg Phe Arg Arg Asn Ser Lys Gly Gly Gly Arg Ser Asp Met Val Lys
35 40 45

ccg ttt aat atc ata aat ttt tcg aca caa gac aaa aac agt agt tgt 192
Pro Phe Asn Ile Ile Asn Phe Ser Thr Gln Asp Lys Asn Ser Ser Cys
50 55 60

tgt tgt ttc acc aag ttt cag atc gtg aag ctt ctc ttg ttt atc ctt 240
Cys Cys Phe Thr Lys Phe Gln Ile Val Lys Leu Leu Leu Phe Ile Leu
65 70 75 80

ctc tct gcc act ctc ttc acc att atc tat tct cct gaa gct tat cat 288
Leu Ser Ala Thr Leu Phe Thr Ile Ile Tyr Ser Pro Glu Ala Tyr His
85 90 95

cat tct ctt tcc cac tca tct tct cgg tgg ata tgg aga aga caa gat 336
His Ser Leu Ser His Ser Ser Ser Arg Trp Ile Trp Arg Arg Gln Asp
100 105 110

cca cgt tac ttc tcg gat ctg gat ata aac tgg gac gat gtg act aaa 384
Pro Arg Tyr Phe Ser Asp Leu Asp Ile Asn Trp Asp Asp Val Thr Lys
115 120 125

acc ctt gag aac atc gaa gaa ggc cgt acg atc ggt gtc ttg aat ttt 432
Thr Leu Glu Asn Ile Glu Glu Gly Arg Thr Ile Gly Val Leu Asn Phe
130 135 140

gat tcg aac gag atc caa cga tgg aga gaa gta tcc aag agc aag gac 480
Asp Ser Asn Glu Ile Gln Arg Trp Arg Glu Val Ser Lys Ser Lys Asp
145 150 155 160

aat ggg gat gaa gaa aaa gtt gtt gta ttg aat cta gat tac gca gac 528
Asn Gly Asp Glu Glu Lys Val Val Val Leu Asn Leu Asp Tyr Ala Asp
165 170 175

aag aat gtg act tgg gac gca cta tat cca gag tgg atc gat gag gag 576
Lys Asn Val Thr Trp Asp Ala Leu Tyr Pro Glu Trp Ile Asp Glu Glu
180 185 190

caa gaa aca gag gtc cct gtt tgt cct aat atc ccg aac att aag gta 624
Gln Glu Thr Glu Val Pro Val Cys Pro Asn Ile Pro Asn Ile Lys Val
195 200 205

cct aca aga aga ctc gat ctg atc gtc gtg aaa ctt cct tgt cgg aaa 672
Pro Thr Arg Arg Leu Asp Leu Ile Val Val Lys Leu Pro Cys Arg Lys
210 215 220

gaa ggg aat tgg tcg aga gac gtc ggg aga ttg cat cta cag cta gcg 720
Glu Gly Asn Trp Ser Arg Asp Val Gly Arg Leu His Leu Gln Leu Ala
225 230 235 240

gct gca act gtg gcg gct tcg gcc aaa ggg ttt ttc agg gga cat gtg 768
Ala Ala Thr Val Ala Ala Ser Ala Lys Gly Phe Phe Arg Gly His Val
245 250 255

ttt ttt gta tct aga tgc ttt ccg att ccg aat ctt ttc cgg tgt aaa 816
Phe Phe Val Ser Arg Cys Phe Pro Ile Pro Asn Leu Phe Arg Cys Lys
260 265 270

gat ctt gtg tct cgg aga ggc gat gtt tgg ttg tac aaa cct aat ctt 864
Asp Leu Val Ser Arg Arg Gly Asp Val Trp Leu Tyr Lys Pro Asn Leu
275 280 285

gat acc ttg aga gac aag ctt cag ctg cct gta ggg tct tgt gag cta 912
Asp Thr Leu Arg Asp Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu
290 295 300

tct ctt cct ctt ggc atc caa gat agg cca agc tta gga aac cct aaa 960
Ser Leu Pro Leu Gly Ile Gln Asp Arg Pro Ser Leu Gly Asn Pro Lys
305 310 315 320

aga gaa gct tac gca aca att ctt cac tca gct cac gtt tac gtc tgc 1008
Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala His Val Tyr Val Cys
325 330 335

ggt gca atc gcc gcg gct cag agc ata aga cag tct ggt tcg acg aga 1056
Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg Gln Ser Gly Ser Thr Arg
340 345 350

gac ctt gtt atc ctt gtt gat gac aac atc agc ggt tac cac cgg agt 1104
Asp Leu Val Ile Leu Val Asp Asp Asn Ile Ser Gly Tyr His Arg Ser
355 360 365

gga cta gaa gcc gcg ggt tgg caa atc cgg acg ata cag agg att cga 1152
Gly Leu Glu Ala Ala Gly Trp Gln Ile Arg Thr Ile Gln Arg Ile Arg
370 375 380

aac cct aag gca gag aaa gat gct tac aac gaa tgg aac tac agc aag 1200
Asn Pro Lys Ala Glu Lys Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys
385 390 395 400

ttc cgg cta tgg cag ctg act gat tac gac aaa atc att ttc atc gac 1248
Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp
405 410 415

gcg gat ctc tta atc ttg aga aac atc gat ttc ttg ttc tcg atg cct 1296
Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Ser Met Pro
420 425 430

gag atc tca gct aca gga aac aat gga act ctg ttt aat tca gga gtt 1344
Glu Ile Ser Ala Thr Gly Asn Asn Gly Thr Leu Phe Asn Ser Gly Val
435 440 445

atg gtg atc gag cct tgc aac tgt acg ttt cag ctt ctg atg gaa cat 1392
Met Val Ile Glu Pro Cys Asn Cys Thr Phe Gln Leu Leu Met Glu His
450 455 460

ata aac gag att gag tct tat aac ggt gga gat caa ggt tac tta aac 1440
Ile Asn Glu Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn
465 470 475 480

gag gta ttc aca tgg tgg cac cgg att cca aaa cat atg aat ttc ttg 1488
Glu Val Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn Phe Leu
485 490 495

aag cat ttt tgg att ggc gat gaa gat gac gcg aaa cgc aag aaa aca 1536
Lys His Phe Trp Ile Gly Asp Glu Asp Asp Ala Lys Arg Lys Lys Thr
500 505 510

gag ctt ttt gga gca gag cct cct gtt ctt tat gtt ctt cat tac ctt 1584
Glu Leu Phe Gly Ala Glu Pro Pro Val Leu Tyr Val Leu His Tyr Leu
515 520 525

ggg atg aag ccg tgg tta tgt tac cgt gac tac gac tgt aac ttc aac 1632
Gly Met Lys Pro Trp Leu Cys Tyr Arg Asp Tyr Asp Cys Asn Phe Asn
530 535 540

tcc gac ata ttc gtt gag ttt gct acc gat atc gct cat cga aaa tgg 1680
Ser Asp Ile Phe Val Glu Phe Ala Thr Asp Ile Ala His Arg Lys Trp
545 550 555 560

tgg atg gtc cac gac gcc atg cca cag gaa ctt cac caa ttc tgt tac 1728
Trp Met Val His Asp Ala Met Pro Gln Glu Leu His Gln Phe Cys Tyr
565 570 575

ttg cga tcc aag caa aag gca cag ctg gaa tat gat cgc cgg caa gca 1776
Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Tyr Asp Arg Arg Gln Ala
580 585 590

gag gcc gca aat tat gcc gac ggt cat tgg aaa ata aga gta aag gac 1824
Glu Ala Ala Asn Tyr Ala Asp Gly His Trp Lys Ile Arg Val Lys Asp
595 600 605

ccg aga ttc aaa att tgc atc gac aaa tta tgt aat tgg aaa agt atg 1872
Pro Arg Phe Lys Ile Cys Ile Asp Lys Leu Cys Asn Trp Lys Ser Met
610 615 620

ctg cgg cat tgg ggc gaa tca aat tgg act gac tac gag tct ttt gtt 1920
Leu Arg His Trp Gly Glu Ser Asn Trp Thr Asp Tyr Glu Ser Phe Val
625 630 635 640

ccc acc cca cca gcc att acc gta gac cgg aga tca tca ctt ccc ggc 1968
Pro Thr Pro Pro Ala Ile Thr Val Asp Arg Arg Ser Ser Leu Pro Gly
645 650 655

cat aac ttg tga 1980
His Asn Leu *



<210> 3 
<211> 659 
<212> PRT 

<213> Arabidopsis thaliana 



<400> 3 



Met Ala Asn Ser Pro Ala Ala Pro Ala Pro Thr Thr Thr Thr Gly Gly
1 5 10 15

Asp Ser Arg Arg Arg Leu Ser Ala Ser Ile Glu Ala Ile Cys Lys Arg
20 25 30

Arg Phe Arg Arg Asn Ser Lys Gly Gly Gly Arg Ser Asp Met Val Lys
35 40 45

Pro Phe Asn Ile Ile Asn Phe Ser Thr Gln Asp Lys Asn Ser Ser Cys
50 55 60

Cys Cys Phe Thr Lys Phe Gln Ile Val Lys Leu Leu Leu Phe Ile Leu
65 70 75 80

Leu Ser Ala Thr Leu Phe Thr Ile Ile Tyr Ser Pro Glu Ala Tyr His
85 90 95

His Ser Leu Ser His Ser Ser Ser Arg Trp Ile Trp Arg Arg Gln Asp
100 105 110

Pro Arg Tyr Phe Ser Asp Leu Asp Ile Asn Trp Asp Asp Val Thr Lys
115 120 125

Thr Leu Glu Asn Ile Glu Glu Gly Arg Thr Ile Gly Val Leu Asn Phe
130 135 140

Asp Ser Asn Glu Ile Gln Arg Trp Arg Glu Val Ser Lys Ser Lys Asp
145 150 155 160

Asn Gly Asp Glu Glu Lys Val Val Val Leu Asn Leu Asp Tyr Ala Asp
165 170 175

Lys Asn Val Thr Trp Asp Ala Leu Tyr Pro Glu Trp Ile Asp Glu Glu
180 185 190

Gln Glu Thr Glu Val Pro Val Cys Pro Asn Ile Pro Asn Ile Lys Val
195 200 205

Pro Thr Arg Arg Leu Asp Leu Ile Val Val Lys Leu Pro Cys Arg Lys
210 215 220

Glu Gly Asn Trp Ser Arg Asp Val Gly Arg Leu His Leu Gln Leu Ala
225 230 235 240

Ala Ala Thr Val Ala Ala Ser Ala Lys Gly Phe Phe Arg Gly His Val
245 250 255

Phe Phe Val Ser Arg Cys Phe Pro Ile Pro Asn Leu Phe Arg Cys Lys
260 265 270

Asp Leu Val Ser Arg Arg Gly Asp Val Trp Leu Tyr Lys Pro Asn Leu
275 280 285

Asp Thr Leu Arg Asp Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu
290 295 300

Ser Leu Pro Leu Gly Ile Gln Asp Arg Pro Ser Leu Gly Asn Pro Lys
305 310 315 320

Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala His Val Tyr Val Cys
325 330 335

Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg Gln Ser Gly Ser Thr Arg
340 345 350

Asp Leu Val Ile Leu Val Asp Asp Asn Ile Ser Gly Tyr His Arg Ser
355 360 365

Gly Leu Glu Ala Ala Gly Trp Gln Ile Arg Thr Ile Gln Arg Ile Arg
370 375 380

Asn Pro Lys Ala Glu Lys Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys
385 390 395 400

Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp
405 410 415

Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Ser Met Pro
420 425 430

Glu Ile Ser Ala Thr Gly Asn Asn Gly Thr Leu Phe Asn Ser Gly Val
435 440 445

Met Val Ile Glu Pro Cys Asn Cys Thr Phe Gln Leu Leu Met Glu His
450 455 460

Ile Asn Glu Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn
465 470 475 480

Glu Val Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn Phe Leu
485 490 495

Lys His Phe Trp Ile Gly Asp Glu Asp Asp Ala Lys Arg Lys Lys Thr
500 505 510

Glu Leu Phe Gly Ala Glu Pro Pro Val Leu Tyr Val Leu His Tyr Leu
515 520 525

Gly Met Lys Pro Trp Leu Cys Tyr Arg Asp Tyr Asp Cys Asn Phe Asn
530 535 540

Ser Asp Ile Phe Val Glu Phe Ala Thr Asp Ile Ala His Arg Lys Trp
545 550 555 560

Trp Met Val His Asp Ala Met Pro Gln Glu Leu His Gln Phe Cys Tyr
565 570 575

Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Tyr Asp Arg Arg Gln Ala
580 585 590

Glu Ala Ala Asn Tyr Ala Asp Gly His Trp Lys Ile Arg Val Lys Asp
595 600 605

Pro Arg Phe Lys Ile Cys Ile Asp Lys Leu Cys Asn Trp Lys Ser Met
610 615 620

Leu Arg His Trp Gly Glu Ser Asn Trp Thr Asp Tyr Glu Ser Phe Val
625 630 635 640

Pro Thr Pro Pro Ala Ile Thr Val Asp Arg Arg Ser Ser Leu Pro Gly
645 650 655

His Asn Leu



<210> 4 
<211> 560 
<212> DNA 
<213> Zea mays 



<400> 4 

aaaattagca gcagccacag caagaggcaa tagaggaatt catgtgctgt ttctgactga 60
ttgcttccca attccaaacc tcttctcttg caaggaccta gtgaaacgtg aaggcaatgc 120
ttggatgtac aaacctgacg tgaaggctct aaaggagaag ctcaggctgc ctgttggttc 180
ctgtgagctt gctgttccac tcaacgcaaa agcacgactc tacacggtag acagacgcag 240
agaagcatat gctacaatac tgcattcagc aagtgaatat gtttgcggtg cgataacagc 300
agctcaaagc attcgtcaag caggatcaac aagagacctt gttattcttg ttgatgacac 360
cataagtgac taccaccgca aggggctgga atctgctggg tggaaggtta gaataataca 420
gaggatccgg aatcccaaag cggaacgtga tgcctacaac gaatggaact acagcaaatt 480
ccggctgtgg cagcttacag attacgacaa ggttattttc attgatgctg atctgctcat 540
cctgaggaac attgatttct 560



<210> 5 
<211> 1034 
<212> DNA 
<213> Zea mays 



<400> 5 

gacgcgtaca acgagtggaa ctacagcaag ttcaggctgt ggcagctgac cgactacgac 60
aaggtcatct tcatagacgc cgacctcctc atcctgagga acgtcgactt cctgttcgcc 120
atgccggaga tcgccgcgac ggscaacaac gccacgctct tcaactccgg cgtcatggtc 180
gtcgagccct ccaactgcac gttccgcctg ctcatggacc acatcgacga gatcacctcg 240
tacaacggcg gggaccaggg gtacctcaac gagatattca cgtggtggca ccgcgtcccc 300
aggcacatga acttcctcaa gcacttctgg gagggcgaca gcgaggccat gaaggcgaag 360
aagacacagc tgttcggcgc ggacccgccg gtcctctacg tcctccacta ccttggcctc 420
aagccgtggc tgtgcttcag agactacgac tgcaactgga acaacgccgg gatgcgcgag 480
ttcgccagcg acgtcgcgca tgcccggtgg tggaaggtgc acgacaggat gccccggaag 540
ctccagtcct actgcctgct gaggtcgcgg cagaaggcca ggctggagtg ggaccggagg 600
caggccgaga aggccaactc tcaagatggc cactggcgcc tcaacgtcac ggacaccagg 660
ctcaagacgt gctttgagaa gttctgcttc tgggagagca tgctctggca ttggggcgag 720
aacagtaaca ggaccaagag cgtccccatg gcagccacga cggcaaggtc gtgatctgta 780
gatatacgaa caccccatcc ccatatggca accatacatg catagcaata gcttgtatag 840
gtagctatgc tttagttctt cgctatatat acagaataca ccactcgatc cctgttgttg 900
tcaaggctgc agctctatgt cgctgccggc ctgccaccat ggctaacgat tcttttgggt 960
tggctgctgt aataagtttc aggtacatgt aaatttccct gctgaaatta cgtgaccgcg 1020
ttgagaaatg aatt 1034



<210> 6 
<211> 3606 
<212> DNA 

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (1)..(3606)

<400> 6 

atg tgt gtc aac ttc tct agt ctg aaa ctt gtt ttg ttt ctt atg atg 

Met Cys Val Asn Phe Ser Ser Leu Lys Leu Val Leu Phe Leu Met Met 
1 5 10 15 

ctg gtt get atg ttc aca etc tac tgt tct cca ccg ttg caa att cct 
Leu Val Ala Met Phe Thr Leu Tyr Cys Ser Pro Pro Leu Gin He Pro 
20 25 30 

gaa gat cca tea agt ttt gca aac aaa tgg ata eta gaa cct get gta 
Glu Asp Pro Ser Ser Phe Ala Asn Lys Trp He Leu Glu Pro Ala Val 
35 40 45 

acc aca gat cct cgc tat ata get aca tct gag ate aac tgg aac agt 
Thr Thr Asp Pro Arg Tyr He Ala Thr Ser Glu He Asn Trp Asn Ser 
50 * 55 60 

atg tea ctt gtt gtt gag cat tac tta tct ggc aga age gag tat caa 
Met Ser Leu Val Val Glu His Tyr Leu Ser Gly Arg Ser Glu Tyr Gin 
65 70 75 80 

gga att ggc ttt eta aat etc aac gat aac gag att aat cga tgg cag 
Gly He Gly Phe Leu Asn Leu Asn Asp" Asn Glu He Asn Arg Trp Gin 
85 90 . 95 

gtg gtc ata aaa tct cac tgt cag cat ata get ttg eat eta gac cat 
Val Val He Lys Ser His Cys Gin His He Ala Leu His Leu Asp His 



48 



96 



144 



192 



240 



288 



336 
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100 105 110 

get gca agt aac ata act tgg aaa tct tta tac ccg gaa tgg att gac 

Ala Ala Ser Asn He Thr Trp Lys Ser Leu Tyr Pro Glu Trp He Asp 

115 120 125 

gag gaa gaa aaa ttc aaa gtc ccc act tgt cct tct ctt cct tgg att 

Glu Glu Glu Lys Phe Lys Val Pro Thr Cys Pro Ser Leu Pro Trp He 

130 135 140 

caa gtt cct gac aag tct cga ate gat ctt ate att gee aag etc cca 

Gin Val Pro Asp Lys Ser Arg He Asp Leu He He Ala Lys Leu Pro 

145 150 155 160 

tgt aac aag tea gga aaa tgg tea aga gat gtg get aga ttg cac tta 528 

Cys Asn Lys Ser Gly Lys Trp Ser Arg Asp Val Ala Arg Leu His Leu 

165 170 175 

caa ctt gca gca get cga gtg gcg gca tct tct gaa ggg ctt cat gat 576 

Gin Leu Ala Ala Ala Arg Val Ala Ala Ser Ser Glu Gly Leu His Asp 

180 185 190 



gtt cat gtg att ttg gta tea gat tgc ttt cca ata ccg aat ctt ttt 
Val His Val He Leu Val Ser Asp Cys Phe Pro He Pro Asn Leu Phe 
195 200 205 



tgt gaa ctt tct gtt cct ctt caa get aaa gat -aat ttc tac teg gca 

Cys Glu Leu Ser Val Pro Leu Gin Ala Lys Asp Asn Phe Tyr Ser Ala 

24 5 '25 0 2 55 

aat gee aag aaa gaa gcg tac gcg acg ate ttg cac tea gat gat get 

Asn Ala Lys Lys Glu Ala Tyr Ala Thr He Leu His Ser Asp Asp Ala 

260 265 270 

ttt gtc tgt gga gee att gca gta gca cag age att cga atg tea ggc 

Phe Val Cys Gly Ala He Ala Val Ala Gin Ser He Arg Met Ser Gly 
275 280 285 

tct act cgc aat ttg gta ata eta gtc gat gat teg ate agt gaa tac 

Ser Thr Arg Asn Leu Val He Leu Val Asp Asp Ser He Ser Glu Tyr 
290 295 300 

cat aga agt ggc ttg gaa tea get gga tgg aag att cac aca ttt caa 

His Arg Ser Gly Leu Glu Ser Ala Gly Trp Lys He His Thr Phe Gin 
305 310 315 32^0 



384 



432 



480 



624 



acg ggt caa gaa ctt gtt gee cgt caa gga aac ata tgg ctg tat aag 672 

Thr Gly Gin Glu Leu Val Ala Arg Gin Gly Asn He Trp Leu Tyr Lys 
210 215 220 

cct aaa ctt cac cag tta aga caa aag tta caa ctt cct gtt ggt tec 720 

Pro Lys Leu His Gin Leu Arg Gin Lys Leu Gin Leu Pro Val Gly Ser 
225 230 235 240 



768 



816 



864 



912 



960 
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X008 



aga ate aga aac ccg aaa get gaa gca aat gca tat aac caa tgg aac 
Arg lie Arg Asn Pro Lys Ala Glu Ala Asn Ala Tyr Asn Gin Trp Asn 
325 330 335 

tac age aaa ttc cgt ctt tgg gaa ttg aca gaa tac aac aag ate ate 
Tyr Ser Lys Phe Arg Leu Trp Glu Leu Thr Glu Tyr Asn Lys He He 



340 



345 350 



355 



gag tac ccc gaa ate tec aca act gga aac gac ggt acg etc ttc aac 
Glu Tyr Pro Glu lie Ser Thr Thr Gly Asn Asp Gly Thr Leu Phe Asn 
370 375 380 

tec ggt eta ate qtq att gaa cca tea aat tea aca ttc cag tta eta 
Ser Glv Leu Met Val 11c Glu Pro Ser Asn Ser Thr Phe Gin Leu Leu 
385 ' 39C 395 400 

atg gat cac ate aac gat ate aat tec tac aat gga gga gac caa ggt 
Met Asp His lie Asr. Asp He Asn Ser Tyr Asn Gly Gly Asp Gin Gly 
405 410 415 

tac ctt aac gag ata ttc aca tgg tgg cat egg att cca aaa cac atg 
Tyr Leu Asn Glu lie Phe Thr Trp Trp His Arg He Pro Lys His Met 
420 425 430 

aat ttc ttg aag cat ttc tgg gaa gga gac aca cct aag cac agg aaa 
Asn Phe Leu Lys His Phe Trp Glu Gly Asp Thr Pro Lys His Arg Lys 
435 440 445 

tct aag acg aga eta ttt gga get gat cct ccg ata etc tac gtt ctt 
Ser Lys Thr Arg Leu Phe Gly Ala Asp Pro Pro He Leu Tyr Val Leu 
450 455 460 

cat tac eta ggt tac aac aaa cca tgg gta tgc ttc aga gac tac gat 
His Tyr Leu Gly Tyr Asn Lys Pro Trp Val Cys Phe Arg Asp Tyr Asp 
465 470 475 480 

tgc aat tgg aat gtc gtt gga tac cat caa ttc gcg age gat gaa gca 
Cys Asn Trp Asn Val Val Gly Tyr His Gin Phe Ala Ser Asp Glu Ala 
y 485 490 495 

cac aaa act tgg tgg aga gtg cac gac gcg atg cct aag aaa ttg cag 
His Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Lys Lys Leu Gin 
500 505 510 

agg ttt tgt eta ctg agt teg aaa caa aag gcg caa ctt gag tgg gat 
Arg Phe Cys Leu Leu Ser Ser Lys Gin Lys Ala Gin Leu Glu Trp Asp 
515 520 525 



1056 



ttc att gat gca gac atg ctt ate etc aga aac atg gat ttc etc ttc 1104 
Phe He Aso Ala Asp Met Leu He Leu Arg Asn Met Asp Phe Leu Phe 

360 365 



1152 



1200 



1248 



1296 



1344 



1392 



1440 



1488 



1536 



1584 
cgg aga caa gct gag aaa gcg aat tac aga gac gga cat tgg agg att 1632
Arg Arg Gln Ala Glu Lys Ala Asn Tyr Arg Asp Gly His Trp Arg Ile
530 535 540

aag atc aaa gat aag aga ctt acg act tgt ttt gaa gat ttc tgt ttc 1680
Lys Ile Lys Asp Lys Arg Leu Thr Thr Cys Phe Glu Asp Phe Cys Phe
545 550 555 560

tgg gag agt atg ctt tgg cat tgg ggc gat tat gaa att ctc gaa acc 1728
Trp Glu Ser Met Leu Trp His Trp Gly Asp Tyr Glu Ile Leu Glu Thr
565 570 575

gac cct ggt ctt acg gag acg atg ata cct tcc tca agt ccc atg gag 1776
Asp Pro Gly Leu Thr Glu Thr Met Ile Pro Ser Ser Ser Pro Met Glu
580 585 590

tca aga cat cga ctc tcg ttc tca aat gag aag aca agt agg agg aga 1824
Ser Arg His Arg Leu Ser Phe Ser Asn Glu Lys Thr Ser Arg Arg Arg
595 600 605

ttt caa aga att gag aag ggt gtc aag ttc aac act ctg aaa ctt gtg 1872
Phe Gln Arg Ile Glu Lys Gly Val Lys Phe Asn Thr Leu Lys Leu Val
610 615 620

ttg att tgt ata atg ctt gga gct ttg ttc acg atc tac cgt ttt cgt 1920
Leu Ile Cys Ile Met Leu Gly Ala Leu Phe Thr Ile Tyr Arg Phe Arg
625 630 635 640

tat cca ccg cta caa att cct gaa att cca act agt ttt ggt ctt act 1968
Tyr Pro Pro Leu Gln Ile Pro Glu Ile Pro Thr Ser Phe Gly Leu Thr
645 650 655

act gat cct cgc tat gta gct aca gct gag atc aac tgg aac cat atg 2016
Thr Asp Pro Arg Tyr Val Ala Thr Ala Glu Ile Asn Trp Asn His Met
660 665 670

tca aat ctt gtt gag aag cac gta ttt ggt aga agc gag tat caa gga 2064
Ser Asn Leu Val Glu Lys His Val Phe Gly Arg Ser Glu Tyr Gln Gly
675 680 685

att ggt ctt ata aat ctt aac gat aac gag att gat cga ttc aag gag 2112
Ile Gly Leu Ile Asn Leu Asn Asp Asn Glu Ile Asp Arg Phe Lys Glu
690 695 700

gta acg aaa tct gac tgt gat cat gta gct ttg cat cta gat tat gct 2160
Val Thr Lys Ser Asp Cys Asp His Val Ala Leu His Leu Asp Tyr Ala
705 710 715 720

gca aag aac ata aca tgg gaa tct tta tac ccg gaa tgg att gat gaa 2208
Ala Lys Asn Ile Thr Trp Glu Ser Leu Tyr Pro Glu Trp Ile Asp Glu
725 730 735
gtt gaa gaa ttc gaa gtc cct act tgt cct tct ctg cct ttg att caa 2256
Val Glu Glu Phe Glu Val Pro Thr Cys Pro Ser Leu Pro Leu Ile Gln
740 745 750

att cct ggc aag cct cgg att gat ctt gta att gcc aag ctt ccg tgt 2304
Ile Pro Gly Lys Pro Arg Ile Asp Leu Val Ile Ala Lys Leu Pro Cys
755 760 765

gat aaa tca gga aaa tgg tct aga gat gtg gct cgc ttg cat tta caa 2352
Asp Lys Ser Gly Lys Trp Ser Arg Asp Val Ala Arg Leu His Leu Gln
770 775 780

ctt gca gca gct cga gtg gcg gct tct tct aaa gga ctt cat aat gtt 2400
Leu Ala Ala Ala Arg Val Ala Ala Ser Ser Lys Gly Leu His Asn Val
785 790 795 800

cat gtg att ttg gta tct gat tgc ttt cca ata ccg aat ctt ttt acg 2448
His Val Ile Leu Val Ser Asp Cys Phe Pro Ile Pro Asn Leu Phe Thr
805 810 815

ggt caa gaa ctt gtt gcc cgt caa gga aac ata tgg ctg tat aag cct 2496
Gly Gln Glu Leu Val Ala Arg Gln Gly Asn Ile Trp Leu Tyr Lys Pro
820 825 830

aat ctt cac cag cta aga caa aag tta cag ctt cct gtt ggt tcc tgt 2544
Asn Leu His Gln Leu Arg Gln Lys Leu Gln Leu Pro Val Gly Ser Cys
835 840 845

gaa ctt tct gtt cct ctt caa gct aaa gat aat ttc tac tcc gca ggt 2592
Glu Leu Ser Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr Ser Ala Gly
850 855 860

aag aaa gaa gct tac gcg act atc ttg cat tct gcc caa ttt tat 2640
Ala Lys Lys Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Gln Phe Tyr
865 870 875 880

gtc tgt gga gcc att gca gct gca cag agc att cga atg tca ggc tct 2688
Val Cys Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg Met Ser Gly Ser
885 890 895

act cgt gat ctg gtc ata ctt gtt gat gaa acg ata agc gaa tac cat 2736
Thr Arg Asp Leu Val Ile Leu Val Asp Glu Thr Ile Ser Glu Tyr His
900 905 910

aaa agt ggc ttg gta gct gct gga tgg aag att caa atg ttt caa aga 2784
Lys Ser Gly Leu Val Ala Ala Gly Trp Lys Ile Gln Met Phe Gln Arg
915 920 925

atc agg aac ccg aat gct gta cca aat gcc tac aac gaa tgg aac tac 2832
Ile Arg Asn Pro Asn Ala Val Pro Asn Ala Tyr Asn Glu Trp Asn Tyr
930 935 940
agc aag ttt cgt ctt tgg caa ctg act gaa tac agt aag atc atc ttc 2880
Ser Lys Phe Arg Leu Trp Gln Leu Thr Glu Tyr Ser Lys Ile Ile Phe
945 950 955 960

atc gat gca gac atg ctt atc ctg aga aac att gat ttc ctc ttc gag 2928
Ile Asp Ala Asp Met Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Glu
965 970 975

ttc cct gag ata tca gca act gga aac aat gct acg ctc ttc aac tct 2976
Phe Pro Glu Ile Ser Ala Thr Gly Asn Asn Ala Thr Leu Phe Asn Ser
980 985 990

ggt cta atg gtg gtt gag cca tct aat tca aca ttc cag tta cta atg 3024
Gly Leu Met Val Val Glu Pro Ser Asn Ser Thr Phe Gln Leu Leu Met
995 1000 1005

gat aac att aat gaa gtt gtg tct tac aac gga gga gac caa ggt tac 3072
Asp Asn Ile Asn Glu Val Val Ser Tyr Asn Gly Gly Asp Gln Gly Tyr
1010 1015 1020

ctt aac gag ata ttc aca tgg tgg cat cgg att cca aaa cac atg aat 3120
Leu Asn Glu Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn
1025 1030 1035 1040

ttc ttg aag cat ttc tgg gaa gga gac gaa cct gag att aaa aaa atg 3168
Phe Leu Lys His Phe Trp Glu Gly Asp Glu Pro Glu Ile Lys Lys Met
1045 1050 1055

aag acg agt cta ttt gga gct gat cct ccg atc cta tac gtt ctt cat 3216
Lys Thr Ser Leu Phe Gly Ala Asp Pro Pro Ile Leu Tyr Val Leu His
1060 1065 1070

tac cta ggt tat aac aaa ccc tgg tta tgc ttc aga gac tat gac tgc 3264
Tyr Leu Gly Tyr Asn Lys Pro Trp Leu Cys Phe Arg Asp Tyr Asp Cys
1075 1080 1085

aat tgg aat gtc gat att ttc cag gaa ttt gct agt gac gag gct cat 3312
Asn Trp Asn Val Asp Ile Phe Gln Glu Phe Ala Ser Asp Glu Ala His
1090 1095 1100

aaa acc tgg tgg aga gtg cac gac gca atg cct gaa aac ttg cat aag 3360
Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Glu Asn Leu His Lys
1105 1110 1115 1120

ttc tgt cta cta aga tcg aaa cag aag gcg caa ctt gaa tgg gat agg 3408
Phe Cys Leu Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp Arg
1125 1130 1135

aga caa gca gag aaa ggg aac tac aaa gat gga cat tgg aag ata aag 3456
Arg Gln Ala Glu Lys Gly Asn Tyr Lys Asp Gly His Trp Lys Ile Lys
1140 1145 1150
atc aaa gac aag aga ctt aag act tgt ttc gaa gat ttc tgc ttt tgg 3504
Ile Lys Asp Lys Arg Leu Lys Thr Cys Phe Glu Asp Phe Cys Phe Trp
1155 1160 1165

gag agt atg ctt tgg cat tgg ggt gag acg aac tct acc aac aat tct 3552
Glu Ser Met Leu Trp His Trp Gly Glu Thr Asn Ser Thr Asn Asn Ser
1170 1175 1180

tcc acc acc acc act tca tca ccg ccg cat aaa acc gct ctc cct tcc 3600
Ser Thr Thr Thr Thr Ser Ser Pro Pro His Lys Thr Ala Leu Pro Ser
1185 1190 1195 1200

ctg tga 3606
Leu



<210> 7 
<211> 1201 
<212> PRT 

<213> Arabidopsis thaliana 

Met°cyl Val Asn Phe Ser Ser Leu Lys Leu Val Leu Phe Leu Met Met 

1 5 10 15

Leu Val Ala Met Phe Thr Leu Tyr Cys Ser Pro Pro Leu Gln Ile Pro
20 25 30

Glu Asp Pro Ser Ser Phe Ala Asn Lys Trp Ile Leu Glu Pro Ala Val
35 40 45

Thr Thr Asp Pro Arg Tyr Ile Ala Thr Ser Glu Ile Asn Trp Asn Ser
50 55 60

Met Ser Leu Val Val Glu His Tyr Leu Ser Gly Arg Ser Glu Tyr Gln
65 70 75 80

Gly Ile Gly Phe Leu Asn Leu Asn Asp Asn Glu Ile Asn Arg Trp Gln
85 90 95

Val Val Ile Lys Ser His Cys Gln His Ile Ala Leu His Leu Asp His
100 105 110

Ala Ala Ser Asn Ile Thr Trp Lys Ser Leu Tyr Pro Glu Trp Ile Asp
115 120 125

Glu Glu Glu Lys Phe Lys Val Pro Thr Cys Pro Ser Leu Pro Trp Ile
130 135 140

Gln Val Pro Asp Lys Ser Arg Ile Asp Leu Ile Ile Ala Lys Leu Pro
145 150 155 160

Cys Asn Lys Ser Gly Lys Trp Ser Arg Asp Val Ala Arg Leu His Leu
165 170 175

Gln Leu Ala Ala Ala Arg Val Ala Ala Ser Ser Glu Gly Leu His Asp
180 185 190

Val His Val Ile Leu Val Ser Asp Cys Phe Pro Ile Pro Asn Leu Phe
195 200 205

Thr Gly Gln Glu Leu Val Ala Arg Gln Gly Asn Ile Trp Leu Tyr Lys
210 215 220

Pro Lys Leu His Gln Leu Arg Gln Lys Leu Gln Leu Pro Val Gly Ser
225 230 235 240

Cys Glu Leu Ser Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr Ser Ala
245 250 255

Asn Ala Lys Lys Glu Ala Tyr Ala Thr Ile Leu His Ser Asp Asp Ala
260 265 270

Phe Val Cys Gly Ala Ile Ala Val Ala Gln Ser Ile Arg Met Ser Gly
275 280 285

Ser Thr Arg Asn Leu Val Ile Leu Val Asp Asp Ser Ile Ser Glu Tyr
290 295 300

His Arg Ser Gly Leu Glu Ser Ala Gly Trp Lys Ile His Thr Phe Gln
305 310 315 320

Arg Ile Arg Asn Pro Lys Ala Glu Ala Asn Ala Tyr Asn Gln Trp Asn
325 330 335

Tyr Ser Lys Phe Arg Leu Trp Glu Leu Thr Glu Tyr Asn Lys Ile Ile
340 345 350

Phe Ile Asp Ala Asp Met Leu Ile Leu Arg Asn Met Asp Phe Leu Phe
355 360 365

Glu Tyr Pro Glu Ile Ser Thr Thr Gly Asn Asp Gly Thr Leu Phe Asn
370 375 380

Ser Gly Leu Met Val Ile Glu Pro Ser Asn Ser Thr Phe Gln Leu Leu
385 390 395 400

Met Asp His Ile Asn Asp Ile Asn Ser Tyr Asn Gly Gly Asp Gln Gly
405 410 415

Tyr Leu Asn Glu Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met
420 425 430

Asn Phe Leu Lys His Phe Trp Glu Gly Asp Thr Pro Lys His Arg Lys
435 440 445

Ser Lys Thr Arg Leu Phe Gly Ala Asp Pro Pro Ile Leu Tyr Val Leu
450 455 460

His Tyr Leu Gly Tyr Asn Lys Pro Trp Val Cys Phe Arg Asp Tyr Asp
465 470 475 480

Cys Asn Trp Asn Val Val Gly Tyr His Gln Phe Ala Ser Asp Glu Ala
485 490 495

His Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Lys Lys Leu Gln
500 505 510

Arg Phe Cys Leu Leu Ser Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp
515 520 525

Arg Arg Gln Ala Glu Lys Ala Asn Tyr Arg Asp Gly His Trp Arg Ile
530 535 540

Lys Ile Lys Asp Lys Arg Leu Thr Thr Cys Phe Glu Asp Phe Cys Phe
545 550 555 560

Trp Glu Ser Met Leu Trp His Trp Gly Asp Tyr Glu Ile Leu Glu Thr
565 570 575

Asp Pro Gly Leu Thr Glu Thr Met Ile Pro Ser Ser Ser Pro Met Glu
580 585 590

Ser Arg His Arg Leu Ser Phe Ser Asn Glu Lys Thr Ser Arg Arg Arg
595 600 605

Phe Gln Arg Ile Glu Lys Gly Val Lys Phe Asn Thr Leu Lys Leu Val
610 615 620

Leu Ile Cys Ile Met Leu Gly Ala Leu Phe Thr Ile Tyr Arg Phe Arg
625 630 635 640

Tyr Pro Pro Leu Gln Ile Pro Glu Ile Pro Thr Ser Phe Gly Leu Thr
645 650 655



Thr Asp Pro Arg Tyr Val Ala Thr Ala Glu Ile Asn Trp Asn His Met
660 665 670

Ser Asn Leu Val Glu Lys His Val Phe Gly Arg Ser Glu Tyr Gln Gly
675 680 685

Ile Gly Leu Ile Asn Leu Asn Asp Asn Glu Ile Asp Arg Phe Lys Glu
690 695 700

Val Thr Lys Ser Asp Cys Asp His Val Ala Leu His Leu Asp Tyr Ala
705 710 715 720

Ala Lys Asn Ile Thr Trp Glu Ser Leu Tyr Pro Glu Trp Ile Asp Glu
725 730 735

Val Glu Glu Phe Glu Val Pro Thr Cys Pro Ser Leu Pro Leu Ile Gln
740 745 750

Ile Pro Gly Lys Pro Arg Ile Asp Leu Val Ile Ala Lys Leu Pro Cys
755 760 765

Asp Lys Ser Gly Lys Trp Ser Arg Asp Val Ala Arg Leu His Leu Gln
770 775 780

Leu Ala Ala Ala Arg Val Ala Ala Ser Ser Lys Gly Leu His Asn Val
785 790 795 800

His Val Ile Leu Val Ser Asp Cys Phe Pro Ile Pro Asn Leu Phe Thr
805 810 815

Gly Gln Glu Leu Val Ala Arg Gln Gly Asn Ile Trp Leu Tyr Lys Pro
820 825 830

Asn Leu His Gln Leu Arg Gln Lys Leu Gln Leu Pro Val Gly Ser Cys
835 840 845

Glu Leu Ser Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr Ser Ala Gly
850 855 860

Ala Lys Lys Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Gln Phe Tyr
865 870 875 880

Val Cys Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg Met Ser Gly Ser
885 890 895

Thr Arg Asp Leu Val Ile Leu Val Asp Glu Thr Ile Ser Glu Tyr His
900 905 910

Lys Ser Gly Leu Val Ala Ala Gly Trp Lys Ile Gln Met Phe Gln Arg
915 920 925

Ile Arg Asn Pro Asn Ala Val Pro Asn Ala Tyr Asn Glu Trp Asn Tyr
930 935 940

Ser Lys Phe Arg Leu Trp Gln Leu Thr Glu Tyr Ser Lys Ile Ile Phe
945 950 955 960

Ile Asp Ala Asp Met Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Glu
965 970 975

Phe Pro Glu Ile Ser Ala Thr Gly Asn Asn Ala Thr Leu Phe Asn Ser
980 985 990

Gly Leu Met Val Val Glu Pro Ser Asn Ser Thr Phe Gln Leu Leu Met
995 1000 1005

Asp Asn Ile Asn Glu Val Val Ser Tyr Asn Gly Gly Asp Gln Gly Tyr
1010 1015 1020

Leu Asn Glu Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn
1025 1030 1035 1040

Phe Leu Lys His Phe Trp Glu Gly Asp Glu Pro Glu Ile Lys Lys Met
1045 1050 1055

Lys Thr Ser Leu Phe Gly Ala Asp Pro Pro Ile Leu Tyr Val Leu His
1060 1065 1070

Tyr Leu Gly Tyr Asn Lys Pro Trp Leu Cys Phe Arg Asp Tyr Asp Cys
1075 1080 1085

Asn Trp Asn Val Asp Ile Phe Gln Glu Phe Ala Ser Asp Glu Ala His
1090 1095 1100

Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Glu Asn Leu His Lys
1105 1110 1115 1120

Phe Cys Leu Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp Arg
1125 1130 1135

Arg Gln Ala Glu Lys Gly Asn Tyr Lys Asp Gly His Trp Lys Ile Lys
1140 1145 1150

Ile Lys Asp Lys Arg Leu Lys Thr Cys Phe Glu Asp Phe Cys Phe Trp
1155 1160 1165

Glu Ser Met Leu Trp His Trp Gly Glu Thr Asn Ser Thr Asn Asn Ser
1170 1175 1180

Ser Thr Thr Thr Thr Ser Ser Pro Pro His Lys Thr Ala Leu Pro Ser
1185 1190 1195 1200

Leu



<210> 8 
<211> 1653 
<212> DNA 

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (1)..(1653)

<400> 8 

atg ggg gcc aaa agc aaa agt tcg agt acg aga ttt ttt atg ttt tat 48
Met Gly Ala Lys Ser Lys Ser Ser Ser Thr Arg Phe Phe Met Phe Tyr
1 5 10 15

ctt ata cta ata tca ttg tcg ttt ttg ggt ttg ctc tta aac ttt aaa 96
Leu Ile Leu Ile Ser Leu Ser Phe Leu Gly Leu Leu Leu Asn Phe Lys
20 25 30

cct ctg ttt ctg ctc aac ccc atg atc gct tct cct tcg ata gtt gag 144
Pro Leu Phe Leu Leu Asn Pro Met Ile Ala Ser Pro Ser Ile Val Glu
35 40 45

att cgt tat tct ttg ccg gaa ccg gtt aaa cgg act ccg ata tgg ctc 192
Ile Arg Tyr Ser Leu Pro Glu Pro Val Lys Arg Thr Pro Ile Trp Leu
50 55 60

cga ctc att aga aac tat ctt ccg gat gag aaa aag atc cga gtg ggt 240
Arg Leu Ile Arg Asn Tyr Leu Pro Asp Glu Lys Lys Ile Arg Val Gly
65 70 75 80

ctt ctc aac atc gca gag aac gag cga gag agc tac gag gca agc ggg 288
Leu Leu Asn Ile Ala Glu Asn Glu Arg Glu Ser Tyr Glu Ala Ser Gly
85 90 95

acg tcg atc ttg gag aat gtc cac gtg tcg ctc gat cct ctt ccg aac 336
Thr Ser Ile Leu Glu Asn Val His Val Ser Leu Asp Pro Leu Pro Asn
100 105 110

aat ctg aca tgg acg agt tta ttc ccg gtt tgg atc gac gag gat cac 384
Asn Leu Thr Trp Thr Ser Leu Phe Pro Val Trp Ile Asp Glu Asp His
115 120 125

acg tgg cac att cct agt tgt cca gaa gtc cct ctc cct aag atg gaa 432
Thr Trp His Ile Pro Ser Cys Pro Glu Val Pro Leu Pro Lys Met Glu
130 135 140

ggt tcc gaa gct gac gtg gac gtc gtc gtt gtc aaa gtc ccg tgc gat 480
Gly Ser Glu Ala Asp Val Asp Val Val Val Val Lys Val Pro Cys Asp
145 150 155 160

ggt ttc tcg gag aag aga ggg tta aga gac gtt ttc agg cta cag gtg 528
Gly Phe Ser Glu Lys Arg Gly Leu Arg Asp Val Phe Arg Leu Gln Val
165 170 175

aat ctg gcg gca gcg aat ctt gtg gtg gag agt ggt cgg agg aat gtt 576
Asn Leu Ala Ala Ala Asn Leu Val Val Glu Ser Gly Arg Arg Asn Val
180 185 190

gat cgg act gtg tac gtt gtc ttc atc gga tct tgt ggg cct atg cat 624
Asp Arg Thr Val Tyr Val Val Phe Ile Gly Ser Cys Gly Pro Met His
195 200 205

gag atc ttt agg tgt gat gag cgc gtg aag cgc gtg ggg gac tat tgg 672
Glu Ile Phe Arg Cys Asp Glu Arg Val Lys Arg Val Gly Asp Tyr Trp
210 215 220

gtc tat agg cct gat ctt acg agg ttg aag cag aag ctt ctc atg cct 720
Val Tyr Arg Pro Asp Leu Thr Arg Leu Lys Gln Lys Leu Leu Met Pro
225 230 235 240

cct ggt tca tgt cag att gct ccg cta ggt caa gga gaa gca tgg ata 768
Pro Gly Ser Cys Gln Ile Ala Pro Leu Gly Gln Gly Glu Ala Trp Ile
245 250 255

caa gac aag aac aga aat ctc aca tcc gaa aaa act aca tta tca tca 816
Gln Asp Lys Asn Arg Asn Leu Thr Ser Glu Lys Thr Thr Leu Ser Ser
260 265 270

ttt act gcc caa cgt gtc gct tac gtg acg tta cta cac tca tcg gag 864
Phe Thr Ala Gln Arg Val Ala Tyr Val Thr Leu Leu His Ser Ser Glu
275 280 285

gta tac gta tgc gga gca ata gcc tta gca caa agc ata agg caa tct 912
Val Tyr Val Cys Gly Ala Ile Ala Leu Ala Gln Ser Ile Arg Gln Ser
290 295 300

gga tca acc aag gac atg att ctc ctc cac gat gac tct ata acc aac 960
Gly Ser Thr Lys Asp Met Ile Leu Leu His Asp Asp Ser Ile Thr Asn
305 310 315 320

atc tct ctc att ggc cta agc ctt gct ggc tgg aaa cta cgg cga gtg 1008
Ile Ser Leu Ile Gly Leu Ser Leu Ala Gly Trp Lys Leu Arg Arg Val
325 330 335

gag aga att cgt agt cct ttt tcc aag aag cgt tct tac aat gag tgg 1056
Glu Arg Ile Arg Ser Pro Phe Ser Lys Lys Arg Ser Tyr Asn Glu Trp
340 345 350

aac tac agt aag tta cgt gtg tgg caa gtg aca gat tac gac aaa cta 1104
Asn Tyr Ser Lys Leu Arg Val Trp Gln Val Thr Asp Tyr Asp Lys Leu
355 360 365

gtg ttt ata gac gca gac ttc atc atc gtc aag aat att gat tac ctt 1152
Val Phe Ile Asp Ala Asp Phe Ile Ile Val Lys Asn Ile Asp Tyr Leu
370 375 380

ttc tcc tat cct caa ctt tct gcc gct ggc aat aac aaa gtc ttg ttc 1200
Phe Ser Tyr Pro Gln Leu Ser Ala Ala Gly Asn Asn Lys Val Leu Phe
385 390 395 400

aac tca gga gtc atg gtt ctg gag cca tca gct tgt tta ttc gag gat 1248
Asn Ser Gly Val Met Val Leu Glu Pro Ser Ala Cys Leu Phe Glu Asp
405 410 415

ttg atg ctt aaa tca ttc aag atc ggg tca tac aac ggg gga gac caa 1296
Leu Met Leu Lys Ser Phe Lys Ile Gly Ser Tyr Asn Gly Gly Asp Gln
420 425 430

gga ttt ctg aac gaa tat ttc gtg tgg tgg cat agg cat gat aaa gcg 1344
Gly Phe Leu Asn Glu Tyr Phe Val Trp Trp His Arg His Asp Lys Ala
435 440 445

cgc aat ctt cca gaa aat tta gag ggc ata cac tac ttg gga cta aaa 1392
Arg Asn Leu Pro Glu Asn Leu Glu Gly Ile His Tyr Leu Gly Leu Lys
450 455 460

cca tgg cga tgt tac aga gac tac gat tgt aac tgg gac ttg aaa acg 1440
Pro Trp Arg Cys Tyr Arg Asp Tyr Asp Cys Asn Trp Asp Leu Lys Thr
465 470 475 480

cga cgt gtg tat gca agc gag tcg gtg cat gcg aga tgg tgg aaa gtg 1488
Arg Arg Val Tyr Ala Ser Glu Ser Val His Ala Arg Trp Trp Lys Val
485 490 495

tac gac aag atg cct aag aag ctg aaa ggt tat tgt ggt ttg aat ctt 1536
Tyr Asp Lys Met Pro Lys Lys Leu Lys Gly Tyr Cys Gly Leu Asn Leu
500 505 510

aag atg gag aag aac gtt gag aag tgg agg aaa atg gct aag ctc aat 1584
Lys Met Glu Lys Asn Val Glu Lys Trp Arg Lys Met Ala Lys Leu Asn
515 520 525

ggt ttt cct gaa aat cat tgg aaa att aga ata aaa gat cct agg aag 1632
Gly Phe Pro Glu Asn His Trp Lys Ile Arg Ile Lys Asp Pro Arg Lys
530 535 540

aag aac cgt cta agt caa tga 1653
Lys Asn Arg Leu Ser Gln
545 550
Pro Gly Ser Cys Gln Ile Ala Pro Leu Gly Gln Gly Glu Ala Trp Ile
245 250 255

Gln Asp Lys Asn Arg Asn Leu Thr Ser Glu Lys Thr Thr Leu Ser Ser
260 265 270

Phe Thr Ala Gln Arg Val Ala Tyr Val Thr Leu Leu His Ser Ser Glu
275 280 285

Val Tyr Val Cys Gly Ala Ile Ala Leu Ala Gln Ser Ile Arg Gln Ser
290 295 300

Gly Ser Thr Lys Asp Met Ile Leu Leu His Asp Asp Ser Ile Thr Asn
305 310 315 320

Ile Ser Leu Ile Gly Leu Ser Leu Ala Gly Trp Lys Leu Arg Arg Val
325 330 335

Glu Arg Ile Arg Ser Pro Phe Ser Lys Lys Arg Ser Tyr Asn Glu Trp
340 345 350

Asn Tyr Ser Lys Leu Arg Val Trp Gln Val Thr Asp Tyr Asp Lys Leu
355 360 365

Val Phe Ile Asp Ala Asp Phe Ile Ile Val Lys Asn Ile Asp Tyr Leu
370 375 380

Phe Ser Tyr Pro Gln Leu Ser Ala Ala Gly Asn Asn Lys Val Leu Phe
385 390 395 400

Asn Ser Gly Val Met Val Leu Glu Pro Ser Ala Cys Leu Phe Glu Asp
405 410 415

Leu Met Leu Lys Ser Phe Lys Ile Gly Ser Tyr Asn Gly Gly Asp Gln
420 425 430

Gly Phe Leu Asn Glu Tyr Phe Val Trp Trp His Arg His Asp Lys Ala
435 440 445

Arg Asn Leu Pro Glu Asn Leu Glu Gly Ile His Tyr Leu Gly Leu Lys
450 455 460

Pro Trp Arg Cys Tyr Arg Asp Tyr Asp Cys Asn Trp Asp Leu Lys Thr
465 470 475 480

Arg Arg Val Tyr Ala Ser Glu Ser Val His Ala Arg Trp Trp Lys Val
485 490 495

Tyr Asp Lys Met Pro Lys Lys Leu Lys Gly Tyr Cys Gly Leu Asn Leu
500 505 510

Lys Met Glu Lys Asn Val Glu Lys Trp Arg Lys Met Ala Lys Leu Asn
515 520 525

Gly Phe Pro Glu Asn His Trp Lys Ile Arg Ile Lys Asp Pro Arg Lys
530 535 540

Lys Asn Arg Leu Ser Gln
545 550
<400> 10

atg ggg aca aaa acc cat aat tct aga ggg aaa atc ttc atg atc tat 48
Met Gly Thr Lys Thr His Asn Ser Arg Gly Lys Ile Phe Met Ile Tyr
1 5 10 15

cta atc cta gtc tca ttg tca ctt cta ggt ttg atc tta cct ttt aaa 96
Leu Ile Leu Val Ser Leu Ser Leu Leu Gly Leu Ile Leu Pro Phe Lys
20 25 30

cct ctt ttc cgg att act tct cca tct tca acg tta cgg att gat ctt 144
Pro Leu Phe Arg Ile Thr Ser Pro Ser Ser Thr Leu Arg Ile Asp Leu
35 40 45

cca tcg ccg caa gtc aac aaa aac ccg aaa tgg ctt cga ctc atc cgt 192
Pro Ser Pro Gln Val Asn Lys Asn Pro Lys Trp Leu Arg Leu Ile Arg
50 55 60

aac tat cta cca gaa aaa aga atc caa gtc ggc ttc ctt aac ata gac 240
Asn Tyr Leu Pro Glu Lys Arg Ile Gln Val Gly Phe Leu Asn Ile Asp
65 70 75 80

gag aaa aag cgt gag agc tac gag gct cgt gga ccg ttg gta ctt aag 288
Glu Lys Glu Arg Glu Ser Tyr Glu Ala Arg Gly Pro Leu Val Leu Lys
85 90 95

aac atc cac gtg ccg ctt gat cat ata ccc aag aat gtc act tgg aag 336
Asn Ile His Val Pro Leu Asp His Ile Pro Lys Asn Val Thr Trp Lys
100 105 110

agt ctt tac ccg gag tgg atc aac gag gaa gct tct acc tgt ccg gag 384
Ser Leu Tyr Pro Glu Trp Ile Asn Glu Glu Ala Ser Thr Cys Pro Glu
115 120 125

atc cct ctc cct cag cca gaa ggt tct gat gct aac gtg gac gtt att 432
Ile Pro Leu Pro Gln Pro Glu Gly Ser Asp Ala Asn Val Asp Val Ile
130 135 140

gtt gct aga gtt cca tgt gat ggt tgg tcg gcg aat aaa ggg ctt agg 480
Val Ala Arg Val Pro Cys Asp Gly Trp Ser Ala Asn Lys Gly Leu Arg
145 150 155 160

gac gtt ttt agg ctt cag gtt aat ttg gcc gca gcg aat cta gcc gtc 528
Asp Val Phe Arg Leu Gln Val Asn Leu Ala Ala Ala Asn Leu Ala Val
165 170 175

caa agt ggg ttg agg acg gtt aat cag gcg gtc tac gtt gta ttc atc 576
Gln Ser Gly Leu Arg Thr Val Asn Gln Ala Val Tyr Val Val Phe Ile
180 185 190

ggc tca tgt ggg cct atg cat gag att ttc ccg tgc gat gag cgc gtg 624
Gly Ser Cys Gly Pro Met His Glu Ile Phe Pro Cys Asp Glu Arg Val
195 200 205

atg cgc gtg gag gat tat tgg gtg tat aag cct tat ctc cca agg ttg 672
Met Arg Val Glu Asp Tyr Trp Val Tyr Lys Pro Tyr Leu Pro Arg Leu
210 215 220

aag cag aag ctt ctc atg cct gtt ggt tca tgt cag att gct cct tca 720
Lys Gln Lys Leu Leu Met Pro Val Gly Ser Cys Gln Ile Ala Pro Ser
225 230 235 240

ttt gct caa ttt ggt caa gaa gca tgg aga cca aaa cat gaa gat aat 768
Phe Ala Gln Phe Gly Gln Glu Ala Trp Arg Pro Lys His Glu Asp Asn
245 250 255

ctt gca tca aag gca gtc aca gcc tta ccc cgt cgc tta cgg gtt gcc 816
Leu Ala Ser Lys Ala Val Thr Ala Leu Pro Arg Arg Leu Arg Val Ala
260 265 270

tac gtg aca gta cta cac tcg tca gaa gcc tat gtt tgt ggg gca ata 864
Tyr Val Thr Val Leu His Ser Ser Glu Ala Tyr Val Cys Gly Ala Ile
275 280 285

gct tta gcg caa agt ata aga caa tca gga tcg cat aag gac atg att 912
Ala Leu Ala Gln Ser Ile Arg Gln Ser Gly Ser His Lys Asp Met Ile
290 295 300

ctc ctc cat gat cat acc ata acc aac aag tct ctt att ggt ctc agc 960
Leu Leu His Asp His Thr Ile Thr Asn Lys Ser Leu Ile Gly Leu Ser
305 310 315 320

gct gcg gga tgg aat ctc cgg cta atc gac agg atc cgc agt cct ttt 1008
Ala Ala Gly Trp Asn Leu Arg Leu Ile Asp Arg Ile Arg Ser Pro Phe
325 330 335

tcg caa aaa gac tct tat aat gag tgg aac tat agc aaa tta cgt gtg 1056
Ser Gln Lys Asp Ser Tyr Asn Glu Trp Asn Tyr Ser Lys Leu Arg Val
340 345 350

tgg caa gta act gac tac gat aaa ctt gtg ttc ata gac gca gat ttc 1104
Trp Gln Val Thr Asp Tyr Asp Lys Leu Val Phe Ile Asp Ala Asp Phe
355 360 365

atc atc ctc aag aaa ctt gat cat ctc ttc tac tat cca caa ctc tca 1152
Ile Ile Leu Lys Lys Leu Asp His Leu Phe Tyr Tyr Pro Gln Leu Ser
370 375 380

gct tca ggc aac gac aaa gtg tta ttc aac tcc gga atc atg gtt ctc 1200
Ala Ser Gly Asn Asp Lys Val Leu Phe Asn Ser Gly Ile Met Val Leu
385 390 395 400

gag cca tcg gca tgt atg ttt aaa gat tta atg gag aaa tcg ttc aag 1248
Glu Pro Ser Ala Cys Met Phe Lys Asp Leu Met Glu Lys Ser Phe Lys
405 410 415

att gag tca tac aac gga gga gac caa gga ttc ctt aat gag ata ttt 1296
Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Phe Leu Asn Glu Ile Phe
420 425 430

gta tgg tgg cac agg tta tcg aaa cga gtg aac aca atg aag tac ttc 1344
Val Trp Trp His Arg Leu Ser Lys Arg Val Asn Thr Met Lys Tyr Phe
435 440 445

gac gaa aaa aat cat cga aga cac gat ctt cct gag aat gta gaa ggt 1392
Asp Glu Lys Asn His Arg Arg His Asp Leu Pro Glu Asn Val Glu Gly
450 455 460

ctg cac tac ttg ggg ttg aaa cca tgg gta tgt tat aga gac tat gat 1440
Leu His Tyr Leu Gly Leu Lys Pro Trp Val Cys Tyr Arg Asp Tyr Asp
465 470 475 480

tgc aat tgg gac att agc gaa cga cgc gtg ttt gca agc gat tct gtg 1488
Cys Asn Trp Asp Ile Ser Glu Arg Arg Val Phe Ala Ser Asp Ser Val
485 490 495

cac gaa aaa tgg tgg aaa gtg tat gac aaa atg tca gag cag ttg aaa 1536
His Glu Lys Trp Trp Lys Val Tyr Asp Lys Met Ser Glu Gln Leu Lys
500 505 510

ggt tat tgt ggt ttg aat aag aat atg gag aag agg att gag aag tgg 1584
Gly Tyr Cys Gly Leu Asn Lys Asn Met Glu Lys Arg Ile Glu Lys Trp
515 520 525

aga aga atc gct aag aac aat agt ttg cct gat agg cat tgg gag att 1632
Arg Arg Ile Ala Lys Asn Asn Ser Leu Pro Asp Arg His Trp Glu Ile
530 535 540

gaa gtg aga gat cct agg aag acg aat ctt ctt gtt cag tga 1674
Glu Val Arg Asp Pro Arg Lys Thr Asn Leu Leu Val Gln
545 550 555
Met Gly Thr Lys Thr His Asn Ser Arg Gly Lys Ile Phe Met Ile Tyr
1 5 10 15

Leu Ile Leu Val Ser Leu Ser Leu Leu Gly Leu Ile Leu Pro Phe Lys
20 25 30

Pro Leu Phe Arg Ile Thr Ser Pro Ser Ser Thr Leu Arg Ile Asp Leu
35 40 45

Pro Ser Pro Gln Val Asn Lys Asn Pro Lys Trp Leu Arg Leu Ile Arg
50 55 60
Glu Pro Ser Ala Cys Met Phe Lys Asp Leu Met Glu Lys Ser Phe Lys
405 410 415

Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Phe Leu Asn Glu Ile Phe
420 425 430

Val Trp Trp His Arg Leu Ser Lys Arg Val Asn Thr Met Lys Tyr Phe
435 440 445

Asp Glu Lys Asn His Arg Arg His Asp Leu Pro Glu Asn Val Glu Gly
450 455 460

Leu His Tyr Leu Gly Leu Lys Pro Trp Val Cys Tyr Arg Asp Tyr Asp
465 470 475 480

Cys Asn Trp Asp Ile Ser Glu Arg Arg Val Phe Ala Ser Asp Ser Val
485 490 495

His Glu Lys Trp Trp Lys Val Tyr Asp Lys Met Ser Glu Gln Leu Lys
500 505 510

Gly Tyr Cys Gly Leu Asn Lys Asn Met Glu Lys Arg Ile Glu Lys Trp
515 520 525

Arg Arg Ile Ala Lys Asn Asn Ser Leu Pro Asp Arg His Trp Glu Ile
530 535 540

Glu Val Arg Asp Pro Arg Lys Thr Asn Leu Leu Val Gln
545 550 555



<210> 12 
<211> 1002 
<212> DNA 

<213> Arabidopsis thaliana 



<220> 

<221> CDS 

<222> (1)..(1002)



<400> 12 

atg gcc tta cta aat gaa tta atg agt ttt ttt atc caa aaa caa aaa 48
Met Ala Leu Leu Asn Glu Leu Met Ser Phe Phe Ile Gln Lys Gln Lys
1 5 10 15

gca ggt gta gac aaa gtg tat gac cta acg aag ata gaa gca gag aca 96
Ala Gly Val Asp Lys Val Tyr Asp Leu Thr Lys Ile Glu Ala Glu Thr
20 25 30

aaa cga cca aaa cgt gaa gcc tac gta act gtt ctt cac tct tcc gag 144
Lys Arg Pro Lys Arg Glu Ala Tyr Val Thr Val Leu His Ser Ser Glu
35 40 45

tct tat gtc tgt ggt gcc ata act ttg gct caa agc ctc ctt cag aca 192
Ser Tyr Val Cys Gly Ala Ile Thr Leu Ala Gln Ser Leu Leu Gln Thr
50 55 60

aac acc aaa cgc gat ctt atc ctt ctc cac gat gac tcc atc tcc att 240
Asn Thr Lys Arg Asp Leu Ile Leu Leu His Asp Asp Ser Ile Ser Ile
65 70 75 80

acc aaa ctt cga gct ctc gcc gcc gca gga tgg aag ctt cgt cgg atc 288
Thr Lys Leu Arg Ala Leu Ala Ala Ala Gly Trp Lys Leu Arg Arg Ile
85 90 95

att cga atc aga aac cca ctt gcg gag aag gac tcg tac aat gaa tac 336
Ile Arg Ile Arg Asn Pro Leu Ala Glu Lys Asp Ser Tyr Asn Glu Tyr
100 105 110

aac tac agc aag ttt cga ctc tgg caa ttg aca gat tac gac aaa gtg 384
Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val
115 120 125

atc ttc att gat gcc gac atc atc gtc tta cgt aac ctt gat ctt ctc 432
Ile Phe Ile Asp Ala Asp Ile Ile Val Leu Arg Asn Leu Asp Leu Leu
130 135 140

ttc cat ttt cct cag atg tcg gcc acc gga aat gat gta tgg ata tat 480
Phe His Phe Pro Gln Met Ser Ala Thr Gly Asn Asp Val Trp Ile Tyr
145 150 155 160

aat tca ggc atc atg gtc atc gag cct tct aat tgt acg ttt act aca 528
Asn Ser Gly Ile Met Val Ile Glu Pro Ser Asn Cys Thr Phe Thr Thr
165 170 175

atc atg agc cag cga agc gag atc gtt tca tac aac ggt gga gat caa 576
Ile Met Ser Gln Arg Ser Glu Ile Val Ser Tyr Asn Gly Gly Asp Gln
180 185 190

ggg tac cta aac gag ata ttt gtg tgg tgg cac cga ttg cct cga cga 624
Gly Tyr Leu Asn Glu Ile Phe Val Trp Trp His Arg Leu Pro Arg Arg
195 200 205

gta aac ttt ctg aag aac ttc tgg tcg aac aca acc aaa gaa aga aac 672
Val Asn Phe Leu Lys Asn Phe Trp Ser Asn Thr Thr Lys Glu Arg Asn
210 215 220

atc aag aac aac ctc ttc gcc gcg gag ccg cct cag gtc tac gcg gtc 720
Ile Lys Asn Asn Leu Phe Ala Ala Glu Pro Pro Gln Val Tyr Ala Val
225 230 235 240

cac tac tta ggt tgg aaa cca tgg ctt tgc tat agg gac tac gat tgc 768
His Tyr Leu Gly Trp Lys Pro Trp Leu Cys Tyr Arg Asp Tyr Asp Cys
245 250 255

aac tac gac gtg gac gag cag ttg gtg tac gct agt gat gcg gct cac 816
Asn Tyr Asp Val Asp Glu Gln Leu Val Tyr Ala Ser Asp Ala Ala His
260 265 270

gtt agg tgg tgg aaa gtg cac gac tcc atg gac gat gca ttg caa aag 864
Val Arg Trp Trp Lys Val His Asp Ser Met Asp Asp Ala Leu Gln Lys
275 280 285

ttt tgc agg ctg acg aaa aag agg aga acg gag atc aac tgg gag agg 912
Phe Cys Arg Leu Thr Lys Lys Arg Arg Thr Glu Ile Asn Trp Glu Arg
290 295 300

agg aaa gca agg ctt aga ggt tcc act gat tat cat tgg aag atc aat 960
Arg Lys Ala Arg Leu Arg Gly Ser Thr Asp Tyr His Trp Lys Ile Asn
305 310 315 320

gtc act gat cca aga cga cgt cgt tct tat ttg att ggt taa 1002
Val Thr Asp Pro Arg Arg Arg Arg Ser Tyr Leu Ile Gly
325 330



<210> 13 
<211> 333 
<212> PRT 

<213> Arabidopsis thaliana 

<400> 13

Met Ala Leu Leu Asn Glu Leu Met Ser Phe Phe Ile Gln Lys Gln Lys
1 5 10 15

Ala Gly Val Asp Lys Val Tyr Asp Leu Thr Lys Ile Glu Ala Glu Thr
20 25 30

Lys Arg Pro Lys Arg Glu Ala Tyr Val Thr Val Leu His Ser Ser Glu
35 40 45

Ser Tyr Val Cys Gly Ala Ile Thr Leu Ala Gln Ser Leu Leu Gln Thr
50 55 60

Asn Thr Lys Arg Asp Leu Ile Leu Leu His Asp Asp Ser Ile Ser Ile
65 70 75 80

Thr Lys Leu Arg Ala Leu Ala Ala Ala Gly Trp Lys Leu Arg Arg Ile
85 90 95

Ile Arg Ile Arg Asn Pro Leu Ala Glu Lys Asp Ser Tyr Asn Glu Tyr
100 105 110

Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val
115 120 125

Ile Phe Ile Asp Ala Asp Ile Ile Val Leu Arg Asn Leu Asp Leu Leu
130 135 140

Phe His Phe Pro Gln Met Ser Ala Thr Gly Asn Asp Val Trp Ile Tyr
145 150 155 160

Asn Ser Gly Ile Met Val Ile Glu Pro Ser Asn Cys Thr Phe Thr Thr
165 170 175

Ile Met Ser Gln Arg Ser Glu Ile Val Ser Tyr Asn Gly Gly Asp Gln
180 185 190

Gly Tyr Leu Asn Glu Ile Phe Val Trp Trp His Arg Leu Pro Arg Arg
195 200 205

Val Asn Phe Leu Lys Asn Phe Trp Ser Asn Thr Thr Lys Glu Arg Asn
210 215 220

Ile Lys Asn Asn Leu Phe Ala Ala Glu Pro Pro Gln Val Tyr Ala Val
225 230 235 240

His Tyr Leu Gly Trp Lys Pro Trp Leu Cys Tyr Arg Asp Tyr Asp Cys
245 250 255

Asn Tyr Asp Val Asp Glu Gln Leu Val Tyr Ala Ser Asp Ala Ala His
260 265 270

Val Arg Trp Trp Lys Val His Asp Ser Met Asp Asp Ala Leu Gln Lys
275 280 285

Phe Cys Arg Leu Thr Lys Lys Arg Arg Thr Glu Ile Asn Trp Glu Arg
290 295 300

Arg Lys Ala Arg Leu Arg Gly Ser Thr Asp Tyr His Trp Lys Ile Asn
305 310 315 320

Val Thr Asp Pro Arg Arg Arg Arg Ser Tyr Leu Ile Gly
325 330 
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<210> 14 

<211> 834 

<212> DNA 

<213> Arabidopsis 



tha liana 



<220> 

<221> CDS 

<222> (1) . - (834) 



<400> 14 

atg gct cct tcc aaa tct gca ctg ata cgc ttt aat cta gtc ttg ttg 48
Met Ala Pro Ser Lys Ser Ala Leu Ile Arg Phe Asn Leu Val Leu Leu
1 5 10 15

gca gcg gag ctt cct ttg ttg gat gct ctt ttc gtg att gca ctc cca 96
Ala Ala Glu Leu Pro Leu Leu Asp Ala Leu Phe Val Ile Ala Leu Pro
20 25 30

aga cta ata gat atc ttt ata ctg cta tgt gat cag gtg gtg aga gga 144
Arg Leu Ile Asp Ile Phe Ile Leu Leu Cys Asp Gln Val Val Arg Gly
35 40 45

gtg aag atg caa gaa ctc gtt gaa gag aac gaa ata aac aag aaa gat 192
Val Lys Met Gln Glu Leu Val Glu Glu Asn Glu Ile Asn Lys Lys Asp
50 55 60

ttg cta acc gct agt aac cag aca aag ctg gag gcg cca agc ttc atg 240
Leu Leu Thr Ala Ser Asn Gln Thr Lys Leu Glu Ala Pro Ser Phe Met
65 70 75 80

gaa gag att tta aca aga ggg tta gga aaa aca aag ata ggg atg gtg 288
Glu Glu Ile Leu Thr Arg Gly Leu Gly Lys Thr Lys Ile Gly Met Val
85 90 95

aac atg gaa gaa tgt gat ctt act aat tgg aaa cgt tat ggc gaa acg 336
Asn Met Glu Glu Cys Asp Leu Thr Asn Trp Lys Arg Tyr Gly Glu Thr
100 105 110

gtt cac ata cat ttt gag cgt gtc tcg aag ctc ttc aaa tgg caa gac 384
Val His Ile His Phe Glu Arg Val Ser Lys Leu Phe Lys Trp Gln Asp
115 120 125

ttg ttc ccc gag tgg ata gat gaa gag gaa gaa acc gag gtt ccc aca 432
Leu Phe Pro Glu Trp Ile Asp Glu Glu Glu Glu Thr Glu Val Pro Thr
130 135 140

tgt cct gag ata cct atg ccc gat ttc gaa agc tta gag aag ttg gat 480
Cys Pro Glu Ile Pro Met Pro Asp Phe Glu Ser Leu Glu Lys Leu Asp
145 150 155 160

ttg gta gta gtg aag ttg cct tgt aat tac cct gaa gaa ggg tgg aga 528
Leu Val Val Val Lys Leu Pro Cys Asn Tyr Pro Glu Glu Gly Trp Arg
165 170 175

aga gag gtt ttg agg ttg caa gtg aac cta gtt gcg gct aac ttg gca 576
Arg Glu Val Leu Arg Leu Gln Val Asn Leu Val Ala Ala Asn Leu Ala
180 185 190

gcc aag aaa ggg aag acg gat tgg aga tgg aaa agc aaa gtg ttg ttt 624
Ala Lys Lys Gly Lys Thr Asp Trp Arg Trp Lys Ser Lys Val Leu Phe
195 200 205

tgg agc aaa tgt caa ccg atg att gag att ttc cgg tgt gat gat ttg 672
Trp Ser Lys Cys Gln Pro Met Ile Glu Ile Phe Arg Cys Asp Asp Leu
210 215 220

gag aag aga gag gca gat tgg tgg ctg tat cgc cct gag gtg gtt agg 720
Glu Lys Arg Glu Ala Asp Trp Trp Leu Tyr Arg Pro Glu Val Val Arg
225 230 235 240

tta caa cag aga ctc agt ttg cca gtc gga tct tgc aat ctt gct ctt 768
Leu Gln Gln Arg Leu Ser Leu Pro Val Gly Ser Cys Asn Leu Ala Leu
245 250 255

cct ttg tgg gca cca caa ggt aaa att act ttc atg caa att aat ctt 816
Pro Leu Trp Ala Pro Gln Gly Lys Ile Thr Phe Met Gln Ile Asn Leu
260 265 270

ctt gct aaa tat ttt tag 834
Leu Ala Lys Tyr Phe
275

<210> 15
<211> 277
<212> PRT
<213> Arabidopsis thaliana

<400> 15

Met Ala Pro Ser Lys Ser Ala Leu Ile Arg Phe Asn Leu Val Leu Leu
1 5 10 15

Ala Ala Glu Leu Pro Leu Leu Asp Ala Leu Phe Val Ile Ala Leu Pro
20 25 30

Arg Leu Ile Asp Ile Phe Ile Leu Leu Cys Asp Gln Val Val Arg Gly
35 40 45

Val Lys Met Gln Glu Leu Val Glu Glu Asn Glu Ile Asn Lys Lys Asp
50 55 60

Leu Leu Thr Ala Ser Asn Gln Thr Lys Leu Glu Ala Pro Ser Phe Met
65 70 75 80

Glu Glu Ile Leu Thr Arg Gly Leu Gly Lys Thr Lys Ile Gly Met Val
85 90 95

Asn Met Glu Glu Cys Asp Leu Thr Asn Trp Lys Arg Tyr Gly Glu Thr
100 105 110

Val His Ile His Phe Glu Arg Val Ser Lys Leu Phe Lys Trp Gln Asp
115 120 125

Leu Phe Pro Glu Trp Ile Asp Glu Glu Glu Glu Thr Glu Val Pro Thr
130 135 140

Cys Pro Glu Ile Pro Met Pro Asp Phe Glu Ser Leu Glu Lys Leu Asp
145 150 155 160

Leu Val Val Val Lys Leu Pro Cys Asn Tyr Pro Glu Glu Gly Trp Arg
165 170 175

Arg Glu Val Leu Arg Leu Gln Val Asn Leu Val Ala Ala Asn Leu Ala
180 185 190

Ala Lys Lys Gly Lys Thr Asp Trp Arg Trp Lys Ser Lys Val Leu Phe
195 200 205

Trp Ser Lys Cys Gln Pro Met Ile Glu Ile Phe Arg Cys Asp Asp Leu
210 215 220

Glu Lys Arg Glu Ala Asp Trp Trp Leu Tyr Arg Pro Glu Val Val Arg
225 230 235 240

Leu Gln Gln Arg Leu Ser Leu Pro Val Gly Ser Cys Asn Leu Ala Leu
245 250 255

Pro Leu Trp Ala Pro Gln Gly Lys Ile Thr Phe Met Gln Ile Asn Leu
260 265 270

Leu Ala Lys Tyr Phe
275

<210> 16
<211> 383
<212> DNA
<213> Hordeum vulgare

<220>
<221> CDS
<222> (46)..(381)

<400> 16

ttgaatctgc gggttggaag gtcagaataa ttgagaggat cggaa ccc gaa gcc gag 57
Pro Glu Ala Glu
1

cgt gat gct tac aat gag tgg aac tac agc aag ttc cgg ttg tgg cag 105
Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu Trp Gln
5 10 15 20

ctc acg gac tat gac aag atc ata ttc ata gat gct gat ctg ctc atc 153
Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp Ala Asp Leu Leu Ile
25 30 35

ttg agg aac att gat ttc ctg ttt aca atg cca gaa atc agt gca acc 201
Leu Arg Asn Ile Asp Phe Leu Phe Thr Met Pro Glu Ile Ser Ala Thr
40 45 50

ggc aac aat gca aca ctc ttc aac tct ggt gtc atg gtc atc gaa ccc 249
Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met Val Ile Glu Pro
55 60 65

tca aac tgc aca ttc cag ctg tta atg gag cac atc aat gag ata aca 297
Ser Asn Cys Thr Phe Gln Leu Leu Met Glu His Ile Asn Glu Ile Thr
70 75 80

tct tac aat ggt ggt gat cag ggc tac ttg aat gag ata ttc aca tgg 345
Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp
85 90 95 100

tgg cat cgg att ccc aag cac atg aac ttc ctg aag ca 383
Trp His Arg Ile Pro Lys His Met Asn Phe Leu Lys
105 110



<210> 17 
<211> 112 
<212> PRT 

<213> Hordeum vulgare 



<400> 17 
Pro Glu Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe
1 5 10 15

Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp Ala
20 25 30

Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Thr Met Pro Glu
35 40 45

Ile Ser Ala Thr Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met
50 55 60

Val Ile Glu Pro Ser Asn Cys Thr Phe Gln Leu Leu Met Glu His Ile
65 70 75 80

Asn Glu Ile Thr Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu
85 90 95

Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn Phe Leu Lys
100 105 110

<210> 18
<211> 245
<212> DNA
<213> Hordeum vulgare

<220>
<221> CDS
<222> (52)..(243)

<400> 18

cgagcttgaa tctgcgggtt ggcaagtcag aataattgag aggatccgga a ccc gaa 57
Pro Glu
1

gcc gag cgt gat gct tac aat gag tgg aac tac agc aag ttc cgg ttg 105
Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu
5 10 15

tgg cag ctc acg gac tat gac aag atc ata ttc ata gat gct gat ctg 153
Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp Ala Asp Leu
20 25 30

ctc atc ttg agg aac att gat ttc ctg ttt aca atg cca gaa atc agt 201
Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Thr Met Pro Glu Ile Ser
35 40 45 50

gca aac ggc aac aat gca aca ctc ttc aac tct ggt gtc atg gt 245
Ala Asn Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met
55 60

<210> 19
<211> 64
<212> PRT
<213> Hordeum vulgare



<400> 19

Pro Glu Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe
1 5 10 15

Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp Ala
20 25 30

Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Thr Met Pro Glu
35 40 45

Ile Ser Ala Asn Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met
50 55 60

<210> 20
<211> 1284
<212> DNA
<213> Triticum aestivum

<220>
<221> CDS
<222> (1)..(1284)

<400> 20

acg cgt ccg ctc gcc ttc ttc ttc ctc gtt cta cat ggc cct cct gct 48
Thr Arg Pro Leu Ala Phe Phe Phe Leu Val Leu His Gly Pro Pro Ala
1 5 10 15

cca ccc caa gta ctc cca cat cct cga ccg cgg cgc ctc ctc tct ggt 96
Pro Pro Gln Val Leu Pro His Pro Arg Pro Arg Arg Leu Leu Ser Gly
20 25 30

ccg ctg cac ctt ccg cga cgc ctg ccc gtc cac gtc cca cct ctc acg 144
Pro Leu His Leu Pro Arg Arg Leu Pro Val His Val Pro Pro Leu Thr
35 40 45

gaa ggt aag ccg gga gga aga tca gtg gcg gcg gcg aac aag gtg gtg 192
Glu Gly Lys Pro Gly Gly Arg Ser Val Ala Ala Ala Asn Lys Val Val
50 55 60

gcg acg gag cgg atc gtg aac gcg ggg cgc gcg ccg acc atg ttc aac 240
Ala Thr Glu Arg Ile Val Asn Ala Gly Arg Ala Pro Thr Met Phe Asn
65 70 75 80

gag ctg cgc ggc cgg ctg cgg atg ggc ctg gtg aac atc ggc cgc gac 288
Glu Leu Arg Gly Arg Leu Arg Met Gly Leu Val Asn Ile Gly Arg Asp
85 90 95

gag ctg ctg gcg ctg ggc gtg gag gga gac gcc gtg ggc gtg gac ttc 336
Glu Leu Leu Ala Leu Gly Val Glu Gly Asp Ala Val Gly Val Asp Phe
100 105 110

gac cgc gtg tcg gac gtg ttc cgg tgg tca gac ctg ttc ccg gag tgg 384
Asp Arg Val Ser Asp Val Phe Arg Trp Ser Asp Leu Phe Pro Glu Trp
115 120 125

atc gac gag gag gag gag gac ggc gtc ccc tcc tgc ccg gag atc ccc 432
Ile Asp Glu Glu Glu Glu Asp Gly Val Pro Ser Cys Pro Glu Ile Pro
130 135 140

atg ccg gac ttc tcc cgg tac gac gac gac ggc gtg gac gtg gtg gtg 480
Met Pro Asp Phe Ser Arg Tyr Asp Asp Asp Gly Val Asp Val Val Val
145 150 155 160

gcg gcg ctg ccg tgc aac cgg acg gcg gtc cgg ggg tgg aac cgc gac 528
Ala Ala Leu Pro Cys Asn Arg Thr Ala Val Arg Gly Trp Asn Arg Asp
165 170 175

gtg ttc agg ctg cag gtg cac ctg gtg gcg gcg cac atg gcg gcg cgg 576
Val Phe Arg Leu Gln Val His Leu Val Ala Ala His Met Ala Ala Arg
180 185 190

aag tgg gcg gcg cga cgg cgc cgg ccg ggt gcg cgt ggt gct gcg gag 624
Lys Trp Ala Ala Arg Arg Arg Arg Pro Gly Ala Arg Gly Ala Ala Glu
195 200 205

cga gtg cga gcc gat gat gga cct gtt ccg gtg cga cga gtc cgt ggg 672
Arg Val Arg Ala Asp Asp Gly Pro Val Pro Val Arg Arg Val Arg Gly
210 215 220

gcg gga ggg gga ctg gtg gat gta cag cgt cga cgc gcc gcg cat gga 720
Ala Gly Gly Gly Leu Val Asp Val Gln Arg Arg Arg Ala Ala His Gly
225 230 235 240

gga gaa gct ccg gct gcc cat cgg ctc ctg caa cct cgc cgc tgc cgc 768
Gly Glu Ala Pro Ala Ala His Arg Leu Leu Gln Pro Arg Arg Cys Arg
245 250 255

tct ggg ggc caa cag gca tcc acg agg tgt tca acg cgt cag acc taa 816
Ser Gly Gly Gln Gln Ala Ser Thr Arg Cys Ser Thr Arg Gln Thr
260 265 270

cag cgg tgg acg ccg gca gcc agc ggc gcg agg cgt acg cga ctg gtg 864
Gln Arg Trp Thr Pro Ala Ala Ser Gly Ala Arg Arg Thr Arg Leu Val
275 280 285

ctg cac tcg tcc gac cga tac ctg tgc ggc gcc atc gtg ctg gcg cag 912
Leu His Ser Ser Asp Arg Tyr Leu Cys Gly Ala Ile Val Leu Ala Gln
290 295 300

agc atc cgg cgg tcg ggc tcc acc cgc gac atg gtc ctc ctc cac gac 960
Ser Ile Arg Arg Ser Gly Ser Thr Arg Asp Met Val Leu Leu His Asp
305 310 315 320

cac acc gtc tcc aag ccg gcc ctc cgc gcg ctg gtc gcc gcc ggc tgg 1008
His Thr Val Ser Lys Pro Ala Leu Arg Ala Leu Val Ala Ala Gly Trp
325 330 335

atc ccg cgc agg atc cgg cgc atc cgc aac ccg cgc gcg gag cgg ggc 1056
Ile Pro Arg Arg Ile Arg Arg Ile Arg Asn Pro Arg Ala Glu Arg Gly
340 345 350

tcc tac aac gag tac aac tac agc aag ttc cgg ctg tgg cag ctg acg 1104
Ser Tyr Asn Glu Tyr Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr
355 360 365

gag tac ttc cgc gtc gtc ttc atc gac gcc gac atc ctc gtc ctc cgc 1152
Glu Tyr Phe Arg Val Val Phe Ile Asp Ala Asp Ile Leu Val Leu Arg
370 375 380

tcc ctc gac gcg ctc ttc cgc ttc ccg cag atc tcc gcc ggg ggc aac 1200
Ser Leu Asp Ala Leu Phe Arg Phe Pro Gln Ile Ser Ala Gly Gly Asn
385 390 395 400

gac ggc tcc ctc ttc aac tcg ggg aac atg gtg ctc gag ccg tcg gcg 1248
Asp Gly Ser Leu Phe Asn Ser Gly Asn Met Val Leu Glu Pro Ser Ala
405 410 415

tgc acc ttc gag gcg ctc gtc cgg ggg cgg cgc aca 1284
Cys Thr Phe Glu Ala Leu Val Arg Gly Arg Arg Thr
420 425



<210> 21 
<211> 271 
<212> PRT 

<213> Triticum aestivum 



<400> 21 
Thr Arg Pro Leu Ala Phe Phe Phe Leu Val Leu His Gly Pro Pro Ala
1 5 10 15

Pro Pro Gln Val Leu Pro His Pro Arg Pro Arg Arg Leu Leu Ser Gly
20 25 30

Pro Leu His Leu Pro Arg Arg Leu Pro Val His Val Pro Pro Leu Thr
35 40 45

Glu Gly Lys Pro Gly Gly Arg Ser Val Ala Ala Ala Asn Lys Val Val
50 55 60

Ala Thr Glu Arg Ile Val Asn Ala Gly Arg Ala Pro Thr Met Phe Asn
65 70 75 80

Glu Leu Arg Gly Arg Leu Arg Met Gly Leu Val Asn Ile Gly Arg Asp
85 90 95

Glu Leu Leu Ala Leu Gly Val Glu Gly Asp Ala Val Gly Val Asp Phe
100 105 110

Asp Arg Val Ser Asp Val Phe Arg Trp Ser Asp Leu Phe Pro Glu Trp
115 120 125

Ile Asp Glu Glu Glu Glu Asp Gly Val Pro Ser Cys Pro Glu Ile Pro
130 135 140

Met Pro Asp Phe Ser Arg Tyr Asp Asp Asp Gly Val Asp Val Val Val
145 150 155 160

Ala Ala Leu Pro Cys Asn Arg Thr Ala Val Arg Gly Trp Asn Arg Asp
165 170 175

Val Phe Arg Leu Gln Val His Leu Val Ala Ala His Met Ala Ala Arg
180 185 190

Lys Trp Ala Ala Arg Arg Arg Arg Pro Gly Ala Arg Gly Ala Ala Glu
195 200 205

Arg Val Arg Ala Asp Asp Gly Pro Val Pro Val Arg Arg Val Arg Gly
210 215 220

Ala Gly Gly Gly Leu Val Asp Val Gln Arg Arg Arg Ala Ala His Gly
225 230 235 240

Gly Glu Ala Pro Ala Ala His Arg Leu Leu Gln Pro Arg Arg Cys Arg
245 250 255

Ser Gly Gly Gln Gln Ala Ser Thr Arg Cys Ser Thr Arg Gln Thr
260 265 270

<210> 22
<211> 156
<212> PRT
<213> Triticum aestivum

<400> 22

Gln Arg Trp Thr Pro Ala Ala Ser Gly Ala Arg Arg Thr Arg Leu Val
1 5 10 15

Leu His Ser Ser Asp Arg Tyr Leu Cys Gly Ala Ile Val Leu Ala Gln
20 25 30

Ser Ile Arg Arg Ser Gly Ser Thr Arg Asp Met Val Leu Leu His Asp
35 40 45

His Thr Val Ser Lys Pro Ala Leu Arg Ala Leu Val Ala Ala Gly Trp
50 55 60

Ile Pro Arg Arg Ile Arg Arg Ile Arg Asn Pro Arg Ala Glu Arg Gly
65 70 75 80

Ser Tyr Asn Glu Tyr Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr
85 90 95

Glu Tyr Phe Arg Val Val Phe Ile Asp Ala Asp Ile Leu Val Leu Arg
100 105 110

Ser Leu Asp Ala Leu Phe Arg Phe Pro Gln Ile Ser Ala Gly Gly Asn
115 120 125

Asp Gly Ser Leu Phe Asn Ser Gly Asn Met Val Leu Glu Pro Ser Ala
130 135 140

Cys Thr Phe Glu Ala Leu Val Arg Gly Arg Arg Thr
145 150 155

<210> 23
<211> 2028
<212> DNA
<213> Arabidopsis thaliana

<220>
<221> CDS
<222> (1)..(1854)

<400> 23
atg ata cct tcc tca agt ccc atg gag tca aga cat cga ctc tcg ttc 48
Met Ile Pro Ser Ser Ser Pro Met Glu Ser Arg His Arg Leu Ser Phe
1 5 10 15

tca aat gag aag aca agt agg agg aga ttt caa aga att gag aag ggt 96
Ser Asn Glu Lys Thr Ser Arg Arg Arg Phe Gln Arg Ile Glu Lys Gly
20 25 30

gtc aag ttc aac act ctg aaa ctt gtg ttg att tgt ata atg ctt gga 144
Val Lys Phe Asn Thr Leu Lys Leu Val Leu Ile Cys Ile Met Leu Gly
35 40 45

gct ttg ttc acg atc tac cgt ttt cgt tat cca ccg cta caa att cct 192
Ala Leu Phe Thr Ile Tyr Arg Phe Arg Tyr Pro Pro Leu Gln Ile Pro
50 55 60

gaa att cca act agt ttt ggt ctt act act gat cct cgc tat gta gct 240
Glu Ile Pro Thr Ser Phe Gly Leu Thr Thr Asp Pro Arg Tyr Val Ala
65 70 75 80

aca gct gag atc aac tgg aac cat atg tca aat ctt gtt gag aag cac 288
Thr Ala Glu Ile Asn Trp Asn His Met Ser Asn Leu Val Glu Lys His
85 90 95

gta ttt ggt aga agc gag tat caa gga att ggt ctt ata aat ctt aac 336
Val Phe Gly Arg Ser Glu Tyr Gln Gly Ile Gly Leu Ile Asn Leu Asn
100 105 110

gat aac gag att gat cga ttc aag gag gta acg aaa tct gac tgt gat 384
Asp Asn Glu Ile Asp Arg Phe Lys Glu Val Thr Lys Ser Asp Cys Asp
115 120 125

cat gta gct ttg cat cta gat tat gct gca aag aac ata aca tgg gaa 432
His Val Ala Leu His Leu Asp Tyr Ala Ala Lys Asn Ile Thr Trp Glu
130 135 140

tct tta tac ccg gaa tgg att gat gaa gtt gaa gaa ttc gaa gtc cct 480
Ser Leu Tyr Pro Glu Trp Ile Asp Glu Val Glu Glu Phe Glu Val Pro
145 150 155 160

act tgt cct tct ctg cct ttg att caa att cct ggc aag cct cgg att 528
Thr Cys Pro Ser Leu Pro Leu Ile Gln Ile Pro Gly Lys Pro Arg Ile
165 170 175

gat ctt gta att gcc aag ctt ccg tgt gat aaa tca gga aaa tgg tct 576
Asp Leu Val Ile Ala Lys Leu Pro Cys Asp Lys Ser Gly Lys Trp Ser
180 185 190

aga gat gtg gct cgc ttg cat tta caa ctt gca gca gct cga gtg gcg 624
Arg Asp Val Ala Arg Leu His Leu Gln Leu Ala Ala Ala Arg Val Ala
195 200 205

gct tct tct aaa gga ctt cat aat gtt cat gtg att ttg gta tct gat 672
Ala Ser Ser Lys Gly Leu His Asn Val His Val Ile Leu Val Ser Asp
210 215 220

tgc ttt cca ata ccg aat ctt ttt acg ggt caa gaa ctt gtt gcc cgt 720
Cys Phe Pro Ile Pro Asn Leu Phe Thr Gly Gln Glu Leu Val Ala Arg
225 230 235 240

caa gga aac ata tgg ctg tat aag cct aat ctt cac cag cta aga caa 768
Gln Gly Asn Ile Trp Leu Tyr Lys Pro Asn Leu His Gln Leu Arg Gln
245 250 255

aag tta cag ctt cct gtt ggt tcc tgt gaa ctt tct gtt cct ctt caa 816
Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu Ser Val Pro Leu Gln
260 265 270

gct aaa gat aat ttc tac tcc gca ggt gca aag aaa gaa gct tac gcg 864
Ala Lys Asp Asn Phe Tyr Ser Ala Gly Ala Lys Lys Glu Ala Tyr Ala
275 280 285

act atc ttg cat tct gcc caa ttt tat gtc tgt gga gcc att gca gct 912
Thr Ile Leu His Ser Ala Gln Phe Tyr Val Cys Gly Ala Ile Ala Ala
290 295 300

gca cag agc att cga atg tca ggc tct act cgt gat ctg gtc ata ctt 960
Ala Gln Ser Ile Arg Met Ser Gly Ser Thr Arg Asp Leu Val Ile Leu
305 310 315 320

gtt gat gaa acg ata agc gaa tac cat aaa agt ggc ttg gta gct gct 1008
Val Asp Glu Thr Ile Ser Glu Tyr His Lys Ser Gly Leu Val Ala Ala
325 330 335

gga tgg aag att caa atg ttt caa aga atc agg aac ccg aat gct gta 1056
Gly Trp Lys Ile Gln Met Phe Gln Arg Ile Arg Asn Pro Asn Ala Val
340 345 350

cca aat gcc tac aac gaa tgg aac tac agc aag ttt cgt ctt tgg caa 1104
Pro Asn Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu Trp Gln
355 360 365

ctg act gaa tac agt aag atc atc ttc atc gat gca gac atg ctt atc 1152
Leu Thr Glu Tyr Ser Lys Ile Ile Phe Ile Asp Ala Asp Met Leu Ile
370 375 380

ctg aga aac att gat ttc ctc ttc gag ttc cct gag ata tca gca act 1200
Leu Arg Asn Ile Asp Phe Leu Phe Glu Phe Pro Glu Ile Ser Ala Thr
385 390 395 400

gga aac aat gct acg ctc ttc aac tct ggt cta atg gtg gtt gag cca 1248
Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Leu Met Val Val Glu Pro
405 410 415

tct aat tca aca ttc cag tta cta atg gat aac att aat gaa gtt gtg 1296
Ser Asn Ser Thr Phe Gln Leu Leu Met Asp Asn Ile Asn Glu Val Val
420 425 430

tct tac aac gga gga gac caa ggt tac ctt aac gag ata ttc aca tgg 1344
Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp
435 440 445

tgg cat cgg att cca aaa cac atg aat ttc ttg aag cat ttc tgg gaa 1392
Trp His Arg Ile Pro Lys His Met Asn Phe Leu Lys His Phe Trp Glu
450 455 460

gga gac gaa cct gag att aaa aaa atg aag acg agt cta ttt gga gct 1440
Gly Asp Glu Pro Glu Ile Lys Lys Met Lys Thr Ser Leu Phe Gly Ala
465 470 475 480

gat cct ccg atc cta tac gtt ctt cat tac cta ggt tat aac aaa ccc 1488
Asp Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly Tyr Asn Lys Pro
485 490 495

tgg tta tgc ttc aga gac tat gac tgc aat tgg aat gtc gat att ttc 1536
Trp Leu Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn Val Asp Ile Phe
500 505 510

cag gaa ttt gct agt gac gag gct cat aaa acc tgg tgg aga gtg cac 1584
Gln Glu Phe Ala Ser Asp Glu Ala His Lys Thr Trp Trp Arg Val His
515 520 525

gac gca atg cct gaa aac ttg cat aag ttc tgt cta cta aga tcg aaa 1632
Asp Ala Met Pro Glu Asn Leu His Lys Phe Cys Leu Leu Arg Ser Lys
530 535 540

cag aag gcg caa ctt gaa tgg gat agg aga caa gca gag aaa ggg aac 1680
Gln Lys Ala Gln Leu Glu Trp Asp Arg Arg Gln Ala Glu Lys Gly Asn
545 550 555 560

tac aaa gat gga cat tgg aag ata aag atc aaa gac aag aga ctt aag 1728
Tyr Lys Asp Gly His Trp Lys Ile Lys Ile Lys Asp Lys Arg Leu Lys
565 570 575

act tgt ttc gaa gat ttc tgc ttt tgg gag agt atg ctt tgg cat tgg 1776
Thr Cys Phe Glu Asp Phe Cys Phe Trp Glu Ser Met Leu Trp His Trp
580 585 590

ggt gag acg aac tct acc aac aat tct tcc acc acc acc act tca tca 1824
Gly Glu Thr Asn Ser Thr Asn Asn Ser Ser Thr Thr Thr Thr Ser Ser
595 600 605

ccg ccg cat aaa acc gct ctc cct tcc ctg tgaattcttt tggctttctg 1874
Pro Pro His Lys Thr Ala Leu Pro Ser Leu
610 615

gtttggtaca aattactctg cctttcgcca accaaatgtg ggttggatat gttcttttgt 1934
ttttttatta tcagcttgaa acctgtatac gaatcccaga aacaatgtaa tcatgagggg 1994
ataaaggaat gaaagacaaa taaagaattt acag 2028

<210> 24
<211> 618
<212> PRT
<213> Arabidopsis thaliana

<400> 24

Met Ile Pro Ser Ser Ser Pro Met Glu Ser Arg His Arg Leu Ser Phe
1 5 10 15

Ser Asn Glu Lys Thr Ser Arg Arg Arg Phe Gln Arg Ile Glu Lys Gly
20 25 30

Val Lys Phe Asn Thr Leu Lys Leu Val Leu Ile Cys Ile Met Leu Gly
35 40 45

Ala Leu Phe Thr Ile Tyr Arg Phe Arg Tyr Pro Pro Leu Gln Ile Pro
50 55 60

Glu Ile Pro Thr Ser Phe Gly Leu Thr Thr Asp Pro Arg Tyr Val Ala
65 70 75 80

Thr Ala Glu Ile Asn Trp Asn His Met Ser Asn Leu Val Glu Lys His
85 90 95

Val Phe Gly Arg Ser Glu Tyr Gln Gly Ile Gly Leu Ile Asn Leu Asn
100 105 110

Asp Asn Glu Ile Asp Arg Phe Lys Glu Val Thr Lys Ser Asp Cys Asp
115 120 125

His Val Ala Leu His Leu Asp Tyr Ala Ala Lys Asn Ile Thr Trp Glu
130 135 140

Ser Leu Tyr Pro Glu Trp Ile Asp Glu Val Glu Glu Phe Glu Val Pro
145 150 155 160

Thr Cys Pro Ser Leu Pro Leu Ile Gln Ile Pro Gly Lys Pro Arg Ile
165 170 175

Asp Leu Val Ile Ala Lys Leu Pro Cys Asp Lys Ser Gly Lys Trp Ser
180 185 190

Arg Asp Val Ala Arg Leu His Leu Gln Leu Ala Ala Ala Arg Val Ala
195 200 205

Ala Ser Ser Lys Gly Leu His Asn Val His Val Ile Leu Val Ser Asp
210 215 220

Cys Phe Pro Ile Pro Asn Leu Phe Thr Gly Gln Glu Leu Val Ala Arg
225 230 235 240

Gln Gly Asn Ile Trp Leu Tyr Lys Pro Asn Leu His Gln Leu Arg Gln
245 250 255

Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu Ser Val Pro Leu Gln
260 265 270

Ala Lys Asp Asn Phe Tyr Ser Ala Gly Ala Lys Lys Glu Ala Tyr Ala
275 280 285

Thr Ile Leu His Ser Ala Gln Phe Tyr Val Cys Gly Ala Ile Ala Ala
290 295 300

Ala Gln Ser Ile Arg Met Ser Gly Ser Thr Arg Asp Leu Val Ile Leu
305 310 315 320

Val Asp Glu Thr Ile Ser Glu Tyr His Lys Ser Gly Leu Val Ala Ala
325 330 335

Gly Trp Lys Ile Gln Met Phe Gln Arg Ile Arg Asn Pro Asn Ala Val
340 345 350

Pro Asn Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu Trp Gln
355 360 365

Leu Thr Glu Tyr Ser Lys Ile Ile Phe Ile Asp Ala Asp Met Leu Ile
370 375 380

Leu Arg Asn Ile Asp Phe Leu Phe Glu Phe Pro Glu Ile Ser Ala Thr
385 390 395 400

Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Leu Met Val Val Glu Pro
405 410 415

Ser Asn Ser Thr Phe Gln Leu Leu Met Asp Asn Ile Asn Glu Val Val
420 425 430

Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp
435 440 445

Trp His Arg Ile Pro Lys His Met Asn Phe Leu Lys His Phe Trp Glu
450 455 460

Gly Asp Glu Pro Glu Ile Lys Lys Met Lys Thr Ser Leu Phe Gly Ala
465 470 475 480

Asp Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly Tyr Asn Lys Pro
485 490 495

Trp Leu Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn Val Asp Ile Phe
500 505 510

Gln Glu Phe Ala Ser Asp Glu Ala His Lys Thr Trp Trp Arg Val His
515 520 525

Asp Ala Met Pro Glu Asn Leu His Lys Phe Cys Leu Leu Arg Ser Lys
530 535 540

Gln Lys Ala Gln Leu Glu Trp Asp Arg Arg Gln Ala Glu Lys Gly Asn
545 550 555 560

Tyr Lys Asp Gly His Trp Lys Ile Lys Ile Lys Asp Lys Arg Leu Lys
565 570 575

Thr Cys Phe Glu Asp Phe Cys Phe Trp Glu Ser Met Leu Trp His Trp
580 585 590

Gly Glu Thr Asn Ser Thr Asn Asn Ser Ser Thr Thr Thr Thr Ser Ser
595 600 605

Pro Pro His Lys Thr Ala Leu Pro Ser Leu
610 615

<210> 25
<211> 1845
<212> DNA
<213> Oryza sativa

<220>
<221> CDS
<222> (1)..(1845)

<400> 25

atg ggg gtg acg ggc ggc gcc ggg gag gcc gtc aag ccg tcg tcg tcg 48
Met Gly Val Thr Gly Gly Ala Gly Glu Ala Val Lys Pro Ser Ser Ser
1 5 10 15

tcg tcg ttg tcg ccg gtg gcg ggg ctg agg gcg gcg gcc atc gtg aag 96
Ser Ser Leu Ser Pro Val Ala Gly Leu Arg Ala Ala Ala Ile Val Lys
20 25 30

ctg aac gcg gcg ttc ctc gcc ttc ttc ttc ctc gcg tac atg gcg ctc 144
Leu Asn Ala Ala Phe Leu Ala Phe Phe Phe Leu Ala Tyr Met Ala Leu
35 40 45

ctc ctc cac ccc aag tac tcc tac ctc ctc gac cgc ggc gcc gcc tcc 192
Leu Leu His Pro Lys Tyr Ser Tyr Leu Leu Asp Arg Gly Ala Ala Ser
50 55 60

tcc ctc gtc cgc tgc acc gcc ttc cgc gac gcc tgc acc ccg gcg acg 240
Ser Leu Val Arg Cys Thr Ala Phe Arg Asp Ala Cys Thr Pro Ala Thr
65 70 75 80

acg acc acc gcc cag ctc tct cgg aag ctg gga ggc gtg gcg gcg aac 288
Thr Thr Thr Ala Gln Leu Ser Arg Lys Leu Gly Gly Val Ala Ala Asn
85 90 95

aag gcg gtg gcg gcg gcg gcg gag agg atc gtg aac gcc ggg agg gcg 336
Lys Ala Val Ala Ala Ala Ala Glu Arg Ile Val Asn Ala Gly Arg Ala
100 105 110

ccg gcg atg ttc gac gag ctc cgt ggg cgg ctg cgg atg ggc ctg gtg 384
Pro Ala Met Phe Asp Glu Leu Arg Gly Arg Leu Arg Met Gly Leu Val
115 120 125

aac atc ggc cgc gac gag ctg ctg gcg ctc ggc gtg gag ggc gac gcc 432
Asn Ile Gly Arg Asp Glu Leu Leu Ala Leu Gly Val Glu Gly Asp Ala
130 135 140

gtc ggc gtc gac ttc gag cgc gtc tcc gac atg ttc cgg tgg tcg gac 480
Val Gly Val Asp Phe Glu Arg Val Ser Asp Met Phe Arg Trp Ser Asp
145 150 155 160

ctc ttc ccg gag tgg atc gac gag gag gag gac gac gag ggc ccg tcc 528
Leu Phe Pro Glu Trp Ile Asp Glu Glu Glu Asp Asp Glu Gly Pro Ser
165 170 175

tgc ccg gag ctc ccc atg ccg gac ttc tcc cgg tac ggc gac gtc gac 576
Cys Pro Glu Leu Pro Met Pro Asp Phe Ser Arg Tyr Gly Asp Val Asp
180 185 190

gtg gtg gtg gcg tcg ctg ccg tgc aac cgt tcg gac gcc gcg tgg aac 624
Val Val Val Ala Ser Leu Pro Cys Asn Arg Ser Asp Ala Ala Trp Asn
195 200 205

cgc gac gtg ttc agg ctg cag gtg cac ctc gtg acg gcg cac atg gcg 672
Arg Asp Val Phe Arg Leu Gln Val His Leu Val Thr Ala His Met Ala
210 215 220



aca cac aag ggg ctg egg cac gac gec ggc ggc ggc ggc ggc ggc ggg 720 
111 2g Lys Gly Leu Arg His Asp Ala Gly Gly Gly Gly Gly Gly Gly 

230 * 235 240 



225 



cgg gtg cgc gtg gtg gtg cgc agc gag tgc gag ccc atg atg gac ttg 768
Arg Val Arg Val Val Val Arg Ser Glu Cys Glu Pro Met Met Asp Leu
245 250 255

ttc cgg tgc gac gag gcg gtg ggg agg gac ggc gag tgg tgg atg tac 816
Phe Arg Cys Asp Glu Ala Val Gly Arg Asp Gly Glu Trp Trp Met Tyr
260 265 270

atg gtc gac gtc gag cgg ctg gag gag aag ctc cgg ctt cct gtc ggc 864
Met Val Asp Val Glu Arg Leu Glu Glu Lys Leu Arg Leu Pro Val Gly
275 280 285

tca tgc aac ctc gcc cta cct ctg tgg gga ccc gga ggt atc cag gaa 912
Ser Cys Asn Leu Ala Leu Pro Leu Trp Gly Pro Gly Gly Ile Gln Glu
290 295 300

gtg ttc aac gtg tcg gag ctg acg gcg gcg gcg gca acg gcg ggg cgg 960
Val Phe Asn Val Ser Glu Leu Thr Ala Ala Ala Ala Thr Ala Gly Arg
305 310 315 320

ccg cgg cgg gag gcg tac gcg acg gtg ctc cac tcg tcg gac acg tac 1008
Pro Arg Arg Glu Ala Tyr Ala Thr Val Leu His Ser Ser Asp Thr Tyr
325 330 335

ctg tgc ggc gcg atc gtg ctg gcg cag agc atc cgg cgc gcc ggg tcg 1056
Leu Cys Gly Ala Ile Val Leu Ala Gln Ser Ile Arg Arg Ala Gly Ser
340 345 350

acg cgc gac ctc gtc ctc ctc cac gac cac acc gtg tcg aag ccg gcg 1104
Thr Arg Asp Leu Val Leu Leu His Asp His Thr Val Ser Lys Pro Ala
355 360 365

ctg gcg gcg ctg gtc gcc gcc ggc tgg acc ccg cgc aag atc aag cgc 1152
Leu Ala Ala Leu Val Ala Ala Gly Trp Thr Pro Arg Lys Ile Lys Arg
370 375 380

atc cgc aac ccg cgc gcg gag cgc ggc acc tac aac gag tac aac tac 1200
Ile Arg Asn Pro Arg Ala Glu Arg Gly Thr Tyr Asn Glu Tyr Asn Tyr
385 390 395 400

agc aag ttc cgg ctg tgg cag ctc acc gac tac gac cgc gtg gtg ttc 1248
Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Arg Val Val Phe
405 410 415

gtc gac gcc gac atc ctc gtc ctc cgc gac ctc gac gcc ctc ttc ggc 1296
Val Asp Ala Asp Ile Leu Val Leu Arg Asp Leu Asp Ala Leu Phe Gly
420 425 430

ttc ccg cag ctg acg gcg gtg ggc aac gac ggc tcg ctc ttc aac tcc 1344
Phe Pro Gln Leu Thr Ala Val Gly Asn Asp Gly Ser Leu Phe Asn Ser
435 440 445

ggg gtg atg gtg atc gag ccg tcg cag tgc acg ttc cag tcg ctg atc 1392
Gly Val Met Val Ile Glu Pro Ser Gln Cys Thr Phe Gln Ser Leu Ile
450 455 460

cgg cag cgg cgg acc atc cgg tcc tac aac ggc ggc gat cag ggg ttc 1440
Arg Gln Arg Arg Thr Ile Arg Ser Tyr Asn Gly Gly Asp Gln Gly Phe
465 470 475 480

ctg aac gag gtg ttc gtc tgg tgg cac cgg ctg ccg cgg cgg gtg aac 1488
Leu Asn Glu Val Phe Val Trp Trp His Arg Leu Pro Arg Arg Val Asn
485 490 495

tac ctc aag aac ttc tgg gcg aac act acg gcg gag cgg gcg ctc aag 1536
Tyr Leu Lys Asn Phe Trp Ala Asn Thr Thr Ala Glu Arg Ala Leu Lys
500 505 510

gag cgg ctg ttc cgg gcg gat ccc gcg gag gtg tgg tcg atc cac tac 1584
Glu Arg Leu Phe Arg Ala Asp Pro Ala Glu Val Trp Ser Ile His Tyr
515 520 525

ctg ggg ctg aag ccg tgg acg tgc tac cgc gac tac gac tgc aac tgg 1632
Leu Gly Leu Lys Pro Trp Thr Cys Tyr Arg Asp Tyr Asp Cys Asn Trp
530 535 540

aac atc ggc gac cag cgg gtg tac gcc agc gac gcc gcg cac gcg cgg 1680
Asn Ile Gly Asp Gln Arg Val Tyr Ala Ser Asp Ala Ala His Ala Arg
545 550 555 560

tgg tgg cag gtg tac gac gac atg ggg gag gcc atg cgc tcg ccg tgc 1728
Trp Trp Gln Val Tyr Asp Asp Met Gly Glu Ala Met Arg Ser Pro Cys
565 570 575

cgc ctg tcg gag cgg agg aag atc gag atc gcc tgg gac cga cac ctc 1776
Arg Leu Ser Glu Arg Arg Lys Ile Glu Ile Ala Trp Asp Arg His Leu
580 585 590

gcc gag gag gcc ggc ttc tcc gac cac cac tgg aag atc aac atc acc 1824
Ala Glu Glu Ala Gly Phe Ser Asp His His Trp Lys Ile Asn Ile Thr
595 600 605

gac ccc cgc aag tgg gag tag 1845
Asp Pro Arg Lys Trp Glu
610



<210> 26 
<211> 614 
<212> PRT 

<213> Oryza sativa 
<400> 26 

Met Gly Val Thr Gly Gly Ala Gly Glu Ala Val Lys Pro Ser Ser Ser
1 5 10 15

Ser Ser Leu Ser Pro Val Ala Gly Leu Arg Ala Ala Ala Ile Val Lys
20 25 30

Leu Asn Ala Ala Phe Leu Ala Phe Phe Phe Leu Ala Tyr Met Ala Leu
35 40 45

Leu Leu His Pro Lys Tyr Ser Tyr Leu Leu Asp Arg Gly Ala Ala Ser
50 55 60

Ser Leu Val Arg Cys Thr Ala Phe Arg Asp Ala Cys Thr Pro Ala Thr
65 70 75 80

Thr Thr Thr Ala Gln Leu Ser Arg Lys Leu Gly Gly Val Ala Ala Asn
85 90 95

Lys Ala Val Ala Ala Ala Ala Glu Arg Ile Val Asn Ala Gly Arg Ala
100 105 110

Pro Ala Met Phe Asp Glu Leu Arg Gly Arg Leu Arg Met Gly Leu Val
115 120 125

Asn Ile Gly Arg Asp Glu Leu Leu Ala Leu Gly Val Glu Gly Asp Ala
130 135 140

Val Gly Val Asp Phe Glu Arg Val Ser Asp Met Phe Arg Trp Ser Asp
145 150 155 160

Leu Phe Pro Glu Trp Ile Asp Glu Glu Glu Asp Asp Glu Gly Pro Ser
165 170 175
Cys Pro Glu Leu Pro Met Pro Asp Phe Ser Arg Tyr Gly Asp Val Asp
180 185 190

Val Val Val Ala Ser Leu Pro Cys Asn Arg Ser Asp Ala Ala Trp Asn
195 200 205

Arg Asp Val Phe Arg Leu Gln Val His Leu Val Thr Ala His Met Ala
210 215 220

Ala Arg Lys Gly Leu Arg His Asp Ala Gly Gly Gly Gly Gly Gly Gly
225 230 235 240

Arg Val Arg Val Val Val Arg Ser Glu Cys Glu Pro Met Met Asp Leu
245 250 255

Phe Arg Cys Asp Glu Ala Val Gly Arg Asp Gly Glu Trp Trp Met Tyr
260 265 270

Met Val Asp Val Glu Arg Leu Glu Glu Lys Leu Arg Leu Pro Val Gly
275 280 285

Ser Cys Asn Leu Ala Leu Pro Leu Trp Gly Pro Gly Gly Ile Gln Glu
290 295 300

Val Phe Asn Val Ser Glu Leu Thr Ala Ala Ala Ala Thr Ala Gly Arg
305 310 315 320

Pro Arg Arg Glu Ala Tyr Ala Thr Val Leu His Ser Ser Asp Thr Tyr
325 330 335

Leu Cys Gly Ala Ile Val Leu Ala Gln Ser Ile Arg Arg Ala Gly Ser
340 345 350

Thr Arg Asp Leu Val Leu Leu His Asp His Thr Val Ser Lys Pro Ala
355 360 365

Leu Ala Ala Leu Val Ala Ala Gly Trp Thr Pro Arg Lys Ile Lys Arg
370 375 380

Ile Arg Asn Pro Arg Ala Glu Arg Gly Thr Tyr Asn Glu Tyr Asn Tyr
385 390 395 400

Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Arg Val Val Phe
405 410 415

Val Asp Ala Asp Ile Leu Val Leu Arg Asp Leu Asp Ala Leu Phe Gly
420 425 430

Phe Pro Gln Leu Thr Ala Val Gly Asn Asp Gly Ser Leu Phe Asn Ser
435 440 445

Gly Val Met Val Ile Glu Pro Ser Gln Cys Thr Phe Gln Ser Leu Ile
450 455 460

Arg Gln Arg Arg Thr Ile Arg Ser Tyr Asn Gly Gly Asp Gln Gly Phe
465 470 475 480

Leu Asn Glu Val Phe Val Trp Trp His Arg Leu Pro Arg Arg Val Asn
485 490 495

Tyr Leu Lys Asn Phe Trp Ala Asn Thr Thr Ala Glu Arg Ala Leu Lys
500 505 510

Glu Arg Leu Phe Arg Ala Asp Pro Ala Glu Val Trp Ser Ile His Tyr
515 520 525

Leu Gly Leu Lys Pro Trp Thr Cys Tyr Arg Asp Tyr Asp Cys Asn Trp
530 535 540

Asn Ile Gly Asp Gln Arg Val Tyr Ala Ser Asp Ala Ala His Ala Arg
545 550 555 560

Trp Trp Gln Val Tyr Asp Asp Met Gly Glu Ala Met Arg Ser Pro Cys
565 570 575

Arg Leu Ser Glu Arg Arg Lys Ile Glu Ile Ala Trp Asp Arg His Leu
580 585 590

Ala Glu Glu Ala Gly Phe Ser Asp His His Trp Lys Ile Asn Ile Thr
595 600 605

Asp Pro Arg Lys Trp Glu
610



<210> 27 

<211> 626 

<212> DNA 

<213> Zea mays 

<220> 

<221> CDS 

<222> (133)..(624)

<400> 27 



tccgagcggc cgccccgggc aggtacaaac ctgacgtgaa ggctctaaag gagaagctca 60

ggctgcctgt tggttcctgt gagcttgctg ttccactcaa cgcaaaagca cgactcttac 120

acggtagaca ga cgc aga gaa gca tat gct aca ata ctt cat tca gca agt 171
Arg Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Ser
1 5 10

gaa tat gtt tgc ggt gcg ata aca gca gct caa agc att cgt caa gca 219
Glu Tyr Val Cys Gly Ala Ile Thr Ala Ala Gln Ser Ile Arg Gln Ala
15 20 25

gga tca aca aga gac ctt gtt att ctt gtt gat gac acc ata agt gac 267
Gly Ser Thr Arg Asp Leu Val Ile Leu Val Asp Asp Thr Ile Ser Asp
30 35 40 45

cac cac cgc aag ggg ctg gaa tct gct ggg tgg aag gtc aga ata ata 315
His His Arg Lys Gly Leu Glu Ser Ala Gly Trp Lys Val Arg Ile Ile
50 55 60

gaa agg atc cgg aat ccc aaa gcc gaa cgt gat gcc tac aac gaa tgg 363
Glu Arg Ile Arg Asn Pro Lys Ala Glu Arg Asp Ala Tyr Asn Glu Trp
65 70 75

aac tac agc aaa ttc cgg ctg tgg cag ctt aca gat tac gac aag gtt 411
Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val
80 85 90

att ttc att gat gct gat ctg ctc atc ctg agg aac att gat ttc ttg 459
Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu
95 100 105

ttt gca atg cca gaa atc acc gca act ggg aac aat gcc aca ctc ttc 507
Phe Ala Met Pro Glu Ile Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe
110 115 120 125

aac tct ggg gtg atg gtc att gaa cct tca aac tgc acg ttc cag tta 555
Asn Ser Gly Val Met Val Ile Glu Pro Ser Asn Cys Thr Phe Gln Leu
130 135 140

ctg atg gag cac atc aac gag ata aca tct tac aac ggt ggt gac caa 603
Leu Met Glu His Ile Asn Glu Ile Thr Ser Tyr Asn Gly Gly Asp Gln
145 150 155

ggg tac ctc ggc cgc gac cac gc 626
Gly Tyr Leu Gly Arg Asp His
160



<210> 28 
<211> 164 
<212> PRT 
<213> Zea mays

<400> 28

Arg Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Ser Glu Tyr Val
1 5 10 15

Cys Gly Ala Ile Thr Ala Ala Gln Ser Ile Arg Gln Ala Gly Ser Thr
20 25 30

Arg Asp Leu Val Ile Leu Val Asp Asp Thr Ile Ser Asp His His Arg
35 40 45

Lys Gly Leu Glu Ser Ala Gly Trp Lys Val Arg Ile Ile Glu Arg Ile
50 55 60

Arg Asn Pro Lys Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser
65 70 75 80

Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val Ile Phe Ile
85 90 95

Asp Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Ala Met
100 105 110

Pro Glu Ile Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly
115 120 125

Val Met Val Ile Glu Pro Ser Asn Cys Thr Phe Gln Leu Leu Met Glu
130 135 140

His Ile Asn Glu Ile Thr Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu
145 150 155 160

Gly Arg Asp His
<210> 29
<211> 553
<212> DNA
<213> Zea mays

<220>

<221> CDS

<222> (1)..(552)

<400> 29

tgg aag gtc aga ata ata gaa agg atc cgg aat ccc aaa gcc gaa cgt 48
Trp Lys Val Arg Ile Ile Glu Arg Ile Arg Asn Pro Lys Ala Glu Arg
1 5 10 15

gat gcc tac aac gaa tgg aac tac agc aaa ttc cgg ctg tgg cag ctt 96
Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu
20 25 30

aca gat tac gac aag gtt att ttc att gat gct gat ctg ctc atc ctg 144
Thr Asp Tyr Asp Lys Val Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu
35 40 45

agg aac att gat ttc ttg ttt gca atg cca gaa atc acc gca act ggg 192
Arg Asn Ile Asp Phe Leu Phe Ala Met Pro Glu Ile Thr Ala Thr Gly
50 55 60

aac aat gcc aca ctc ttc aac tct ggg gtg atg gtc att gaa cct tca 240
Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met Val Ile Glu Pro Ser
65 70 75 80

aac tgc acg ttc cag tta ctg atg gag cac atc aac gag ata aca tct 288
Asn Cys Thr Phe Gln Leu Leu Met Glu His Ile Asn Glu Ile Thr Ser
85 90 95

tac aac ggt ggt gac caa ggg tac ctg aac gag ata ttc aca tgg tgg 336
Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp Trp
100 105 110

cac cgg att cca aag cac atg aat ttc ttg aag cat ttc tgg gag ggt 384
His Arg Ile Pro Lys His Met Asn Phe Leu Lys His Phe Trp Glu Gly
115 120 125

gat gag gac gaa gtg aag gcc aag aag act cgg ctg ttc ggc gcc aac 432
Asp Glu Asp Glu Val Lys Ala Lys Lys Thr Arg Leu Phe Gly Ala Asn
130 135 140

cca ccg atc ctc tac gtt ctc cac tac ttg ggg cgg aag cca tgg ctg 480
Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly Arg Lys Pro Trp Leu
145 150 155 160

tgc ttc cgg gac tac gat tgc aac tgg aac gtc gag atc ttg cgg gag 528
Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn Val Glu Ile Leu Arg Glu
165 170 175

ttt gcg agt gac gtt gcg cat gcc c 553
Phe Ala Ser Asp Val Ala His Ala
180



<210> 30 
<211> 184 
<212> PRT
<213> Zea mays

<400> 30

Trp Lys Val Arg Ile Ile Glu Arg Ile Arg Asn Pro Lys Ala Glu Arg
1 5 10 15

Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu
20 25 30

Thr Asp Tyr Asp Lys Val Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu
35 40 45

Arg Asn Ile Asp Phe Leu Phe Ala Met Pro Glu Ile Thr Ala Thr Gly
50 55 60

Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met Val Ile Glu Pro Ser
65 70 75 80

Asn Cys Thr Phe Gln Leu Leu Met Glu His Ile Asn Glu Ile Thr Ser
85 90 95

Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp Trp
100 105 110

His Arg Ile Pro Lys His Met Asn Phe Leu Lys His Phe Trp Glu Gly
115 120 125

Asp Glu Asp Glu Val Lys Ala Lys Lys Thr Arg Leu Phe Gly Ala Asn
130 135 140

Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly Arg Lys Pro Trp Leu
145 150 155 160

Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn Val Glu Ile Leu Arg Glu
165 170 175

Phe Ala Ser Asp Val Ala His Ala
180
<210> 31
<211> 552 
<212> DNA 
<213> Zea mays 

<220> 

<221> CDS 

<222> (1)..(552)

<400> 31

tcc ct| cgc cgg ctc agc ccc aac gcc gac cgc gtc gtc atc gcg tcc 48
Ser Leu Arg Arg Leu Ser Pro Asn Ala Asp Arg Val Val Ile Ala Ser
1 5 10 15

ctc gac gtc ccg ccg ctc tgg gtt cag gca ctg aaa aat gac ggg gta 96
Leu Asp Val Pro Pro Leu Trp Val Gln Ala Leu Lys Asn Asp Gly Val
20 25 30

aag gtg gtc tct gtg gag aat ttg aaa aat cct tac gag aaa caa gaa 144
Lys Val Val Ser Val Glu Asn Leu Lys Asn Pro Tyr Glu Lys Gln Glu
35 40 45

aat ttc aac aga cga ttc aaa ttg act tta aac aag ctg tat gca tgg 192
Asn Phe Asn Arg Arg Phe Lys Leu Thr Leu Asn Lys Leu Tyr Ala Trp
50 55 60

agc ttg gtt tca tat gag cga gtt gtt atg ctt gac tct gac aac att 240
Ser Leu Val Ser Tyr Glu Arg Val Val Met Leu Asp Ser Asp Asn Ile
65 70 75 80

ttc ctc caa aat act gat gag tta ttt cag tgt ggt cag ttc tgt gct 288
Phe Leu Gln Asn Thr Asp Glu Leu Phe Gln Cys Gly Gln Phe Cys Ala
85 90 95

gtc ttc atc aat ccc tgt atc ttc cat aca ggt ctt ttt gtg ctt cag 336
Val Phe Ile Asn Pro Cys Ile Phe His Thr Gly Leu Phe Val Leu Gln
100 105 110

ccc tca atg gat gtt ttt aag aac atg cta cat gag cta gcg gtt gga 384
Pro Ser Met Asp Val Phe Lys Asn Met Leu His Glu Leu Ala Val Gly
115 120 125

cgt gaa aac cca gat ggg gca gac caa ggc ttc ctt gct agt tat ttc 432
Arg Glu Asn Pro Asp Gly Ala Asp Gln Gly Phe Leu Ala Ser Tyr Phe
130 135 140

ccg gac ttg ctt gat cag cca atg ttc cat cca cca gct aat ggt aca 480
Pro Asp Leu Leu Asp Gln Pro Met Phe His Pro Pro Ala Asn Gly Thr
145 150 155 160

aaa ctt tgg ggt act tat cgc ctc ccc cta ggc tac cag atg gat gca 528
Lys Leu Trp Gly Thr Tyr Arg Leu Pro Leu Gly Tyr Gln Met Asp Ala
165 170 175

tct tac tat tat ctg aag ctt cgc 552
Ser Tyr Tyr Tyr Leu Lys Leu Arg
180



<210> 32 
<211> 184 
<212> PRT 
<213> Zea mays 

<400> 32 

Ser Leu Arg Arg Leu Ser Pro Asn Ala Asp Arg Val Val Ile Ala Ser
1 5 10 15

Leu Asp Val Pro Pro Leu Trp Val Gln Ala Leu Lys Asn Asp Gly Val
20 25 30

Lys Val Val Ser Val Glu Asn Leu Lys Asn Pro Tyr Glu Lys Gln Glu
35 40 45

Asn Phe Asn Arg Arg Phe Lys Leu Thr Leu Asn Lys Leu Tyr Ala Trp
50 55 60

Ser Leu Val Ser Tyr Glu Arg Val Val Met Leu Asp Ser Asp Asn Ile
65 70 75 80

Phe Leu Gln Asn Thr Asp Glu Leu Phe Gln Cys Gly Gln Phe Cys Ala
85 90 95

Val Phe Ile Asn Pro Cys Ile Phe His Thr Gly Leu Phe Val Leu Gln
100 105 110

Pro Ser Met Asp Val Phe Lys Asn Met Leu His Glu Leu Ala Val Gly
115 120 125

Arg Glu Asn Pro Asp Gly Ala Asp Gln Gly Phe Leu Ala Ser Tyr Phe
130 135 140

Pro Asp Leu Leu Asp Gln Pro Met Phe His Pro Pro Ala Asn Gly Thr
145 150 155 160

Lys Leu Trp Gly Thr Tyr Arg Leu Pro Leu Gly Tyr Gln Met Asp Ala
165 170 175

Ser Tyr Tyr Tyr Leu Lys Leu Arg
180
<210> 33
<211> 560
<212> DNA
<213> Zea mays

<220>

<221> CDS

<222> (1)..(558)

<400> 33



aaa cct gac gtg aag gcg ttg aag gag aag ctc agg ctg cct gtt ggt 48
Lys Pro Asp Val Lys Ala Leu Lys Glu Lys Leu Arg Leu Pro Val Gly
1 5 10 15

tcc tgt gag ctt gct gtt cca ctc aac gca aaa gca cga ctc tac aca 96
Ser Cys Glu Leu Ala Val Pro Leu Asn Ala Lys Ala Arg Leu Tyr Thr
20 25 30

gta gac aga cgc aga gaa gca tat gcg aca ata ctg cat tca gca agt 144
Val Asp Arg Arg Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Ser
35 40 45

gaa tat gtt tgc ggc gcg atc acg gca gct caa agc att cgt caa gca 192
Glu Tyr Val Cys Gly Ala Ile Thr Ala Ala Gln Ser Ile Arg Gln Ala
50 55 60

gga tca aca aga gac ctc gtt att ctc gtc gac gac acc ata agt gac 240
Gly Ser Thr Arg Asp Leu Val Ile Leu Val Asp Asp Thr Ile Ser Asp
65 70 75 80

cac cac cgc aag ggg ctg caa tct gcg ggg tgg aag gtc agg ata ata 288
His His Arg Lys Gly Leu Gln Ser Ala Gly Trp Lys Val Arg Ile Ile
85 90 95

cag agg atc cgg aac ccc aaa gcc gag cgc gac gcc tac aac gag tgg 336
Gln Arg Ile Arg Asn Pro Lys Ala Glu Arg Asp Ala Tyr Asn Glu Trp
100 105 110

aac tac agc aaa ttc cgg ctg tgg cag ctc acg gat tac gac aag gtc 384
Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val
115 120 125

atc ttc atc gac gcg gat ctc ctc atc ctg agg aac atc gat ttc ctg 432
Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu
130 135 140

ttc gcg ctg ccg gag atc acg gcg acg ggg aac aac gcg acg ctc ttc 480
Phe Ala Leu Pro Glu Ile Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe
145 150 155 160

aac tcg gga gtg atg gtc atc gag cct tcg aac tgc acg ttc cgg cta 528
Asn Ser Gly Val Met Val Ile Glu Pro Ser Asn Cys Thr Phe Arg Leu
165 170 175

ctg atg gag cac atc gac gag ata acg tcg ta 560
Leu Met Glu His Ile Asp Glu Ile Thr Ser
180 185



<210> 34 
<211> 186 
<212> PRT 
<213> Zea mays 

<400> 34

Lys Pro Asp Val Lys Ala Leu Lys Glu Lys Leu Arg Leu Pro Val Gly
1 5 10 15

Ser Cys Glu Leu Ala Val Pro Leu Asn Ala Lys Ala Arg Leu Tyr Thr
20 25 30

Val Asp Arg Arg Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Ser
35 40 45

Glu Tyr Val Cys Gly Ala Ile Thr Ala Ala Gln Ser Ile Arg Gln Ala
50 55 60

Gly Ser Thr Arg Asp Leu Val Ile Leu Val Asp Asp Thr Ile Ser Asp
65 70 75 80

His His Arg Lys Gly Leu Gln Ser Ala Gly Trp Lys Val Arg Ile Ile
85 90 95

Gln Arg Ile Arg Asn Pro Lys Ala Glu Arg Asp Ala Tyr Asn Glu Trp
100 105 110

Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val
115 120 125

Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu
130 135 140

Phe Ala Leu Pro Glu Ile Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe
145 150 155 160

Asn Ser Gly Val Met Val Ile Glu Pro Ser Asn Cys Thr Phe Arg Leu
165 170 175

Leu Met Glu His Ile Asp Glu Ile Thr Ser
180 185



<210> 35 
<211> 566 
<212> PRT

<213> Arabidopsis thaliana
<400> 35

Met Gly Ala Lys Ser Lys Ser Ser Ser Thr Arg Phe Phe Met Phe Tyr
1 5 10 15

Leu Ile Leu Ile Ser Leu Ser Phe Leu Gly Leu Leu Leu Asn Phe Lys
20 25 30

Pro Leu Phe Leu Leu Asn Pro Met Ile Ala Ser Pro Ser Ile Val Glu
35 40 45

Ile Arg Tyr Ser Leu Pro Glu Pro Val Lys Arg Thr Pro Ile Trp Leu
50 55 60

Arg Leu Ile Arg Asn Tyr Leu Pro Asp Glu Lys Lys Ile Arg Val Gly
65 70 75 80

Leu Leu Asn Ile Ala Glu Asn Glu Arg Glu Ser Tyr Glu Ala Ser Gly
85 90 95

Thr Ser Ile Leu Glu Asn Val His Val Ser Leu Asp Pro Leu Pro Asn
100 105 110

Asn Leu Thr Trp Thr Ser Leu Phe Pro Val Trp Ile Asp Glu Asp His
115 120 125

Thr Trp His Ile Pro Ser Cys Pro Glu Val Pro Leu Pro Lys Met Glu
130 135 140

Gly Ser Glu Ala Asp Val Asp Val Val Val Val Lys Val Pro Cys Asp
145 150 155 160

Gly Phe Ser Glu Lys Arg Gly Leu Arg Asp Val Phe Arg Leu Gln Val
165 170 175

Asn Leu Ala Ala Ala Asn Leu Val Val Glu Ser Gly Arg Arg Asn Val
180 185 190

Asp Arg Thr Val Tyr Val Val Phe Ile Gly Ser Cys Gly Pro Met His
195 200 205

Glu Ile Phe Arg Cys Asp Glu Arg Val Lys Arg Val Gly Asp Tyr Trp
210 215 220

Val Tyr Arg Pro Asp Leu Thr Arg Leu Lys Gln Lys Leu Leu Met Pro
225 230 235 240

Pro Gly Ser Cys Gln Ile Ala Pro Leu Gly Gln Gly Glu Ala Trp Ile
245 250 255

Gln Asp Lys Asn Arg Asn Leu Thr Ser Glu Lys Thr Thr Leu Ser Ser
260 265 270

Phe Thr Ala Gln Arg Val Ala Tyr Val Thr Leu Leu His Ser Ser Glu
275 280 285

Val Tyr Val Cys Gly Ala Ile Ala Leu Ala Gln Ser Ile Arg Gln Ser
290 295 300

Gly Ser Thr Lys Asp Met Ile Leu Leu His Asp Asp Ser Ile Thr Asn
305 310 315 320

Ile Ser Leu Ile Gly Leu Ser Leu Ala Gly Trp Lys Leu Arg Arg Val
325 330 335

Glu Arg Ile Arg Ser Pro Phe Ser Lys Lys Arg Ser Tyr Asn Glu Trp
340 345 350

Asn Tyr Ser Lys Leu Arg Val Trp Gln Val Thr Asp Tyr Asp Lys Leu
355 360 365

Val Phe Ile Asp Ala Asp Phe Ile Ile Val Lys Asn Ile Asp Tyr Leu
370 375 380

Phe Ser Tyr Pro Gln Leu Ser Ala Ala Gly Asn Asn Lys Val Leu Phe
385 390 395 400

Asn Ser Gly Val Met Val Leu Glu Pro Ser Ala Cys Leu Phe Glu Asp
405 410 415

Leu Met Leu Lys Ser Phe Lys Ile Gly Ser Tyr Asn Gly Gly Asp Gln
420 425 430

Gly Phe Leu Asn Glu Tyr Phe Val Trp Trp His Arg Leu Ser Lys Arg
435 440 445

Leu Asn Thr Met Lys Tyr Phe Gly Asp Glu Ser Arg His Asp Lys Ala
450 455 460

Arg Asn Leu Pro Glu Asn Leu Glu Gly Ile His Tyr Leu Gly Leu Lys
465 470 475 480

Pro Trp Arg Cys Tyr Arg Asp Tyr Asp Cys Asn Trp Asp Leu Lys Thr
485 490 495

Arg Arg Val Tyr Ala Ser Glu Ser Val His Ala Arg Trp Trp Lys Val
500 505 510

Tyr Asp Lys Met Pro Lys Lys Leu Lys Gly Tyr Cys Gly Leu Asn Leu
515 520 525



Lys Met Glu Lys Asn Val Glu Lys Trp Arg Lys Met Ala Lys Leu Asn
530 535 540

Gly Phe Pro Glu Asn His Trp Lys Ile Arg Ile Lys Asp Pro Arg Lys
545 550 555 560

Lys Asn Arg Leu Ser Glu
565




